METHOD FOR PREDICTING PROTEIN-PROTEIN INTERACTIONS 

FIELD OF THE INVENTION 

The present invention relates to a prediction for protein-protein interactions, a 
method and a device therefor, and proteins obtained using the method and device. 

BACKGROUND OF THE INVENTION 

Many proteins carry out their function by interacting with other proteins or the 
same protein. Thus, it is important to elucidate protein-protein interactions in the 
development of pharmaceuticals, the breeding in agriculture, and the like. Notably, along 
with the progress of genome analysis and cDNA analysis of various organisms including 
pathogenic microorganisms, the number of genes newly found and proteins encoded 
thereby whose functions are not known is rapidly increasing. Elucidating protein-protein 
interactions may permit one to predict the function of a protein whose function is not 
known. 

A conventional method that has been used to screen for a protein interacting with 
a certain protein so as to elucidate their interactions is the so-called two-hybrid system 
(Field, S. The two-hybrid system to detect protein-protein interaction. METHODS: A 
Companion to Meth. Enzymol., 5, 116-124, 1993). However, the two-hybrid system is a 
screening-based experiment, whose operation is complicated and time-consuming. Also, 
the number of proteins obtained is lower than that expected. In addition, this method has 
a disadvantage in that the results depend on the quality of the cDNA library used. In other 
words, this method has the risk that a gene encoding a protein interacting with a certain 
protein is not contained in the cDNA library used. 

On the other hand, protein databases based on genome analysis and cDNA 
analysis have been enhanced, such that a method has also been adopted wherein a protein 
complex in a cell is subjected directly to MALDI-TOF mass spectrometry, followed by 
searching in the database for a fragment of the amino acid sequence thereof (Yates, JR 3rd, 
J. Mass Spectrom. 33, 1-19, 1998; Humphrey-Smith, I., et al., Electrophoresis, 18, 
1217-1242, 1997; Kaufmann, R., J. Biotechnol., 41, 155-175, 1995). This method gives 
information concerning proteins that form a complex, but does not give any information 
concerning the protein-protein interaction. Thus, it must be experimentally confirmed 
which proteins interact with each other. 

SUMMARY OF THE INVENTION 

In one embodiment, the present invention relates to a method for predicting a 
protein or polypeptide (B) that interacts with a specific protein or polypeptide (A), 
wherein the method is characterized by comprising: 

1) decomposing the amino acid sequence of protein or polypeptide (A) into a series of 
oligopeptides having a pre-determined length as sequence information; 

2) searching, within a database of protein or polypeptide amino acid sequences, for a 
protein or polypeptide (C) comprising an amino acid sequence for each member of the 
series or for a protein or polypeptide (D) comprising an amino acid sequence homologous 
to an amino acid sequence of each member of the series; 

3) carrying out local amino acid sequence alignment between said protein or polypeptide 
(A) and the detected protein or polypeptide (C) or detected protein or polypeptide (D); 
and 

4) predicting whether the detected protein or polypeptide (C) and/or protein or 
polypeptide (D) is a protein or polypeptide (B) that interacts with the protein or 
polypeptide (A) based on the results of the local amino acid sequence alignment and a 
value calculated from a frequency of an amino acid and/or a frequency of said 
oligopeptides in said amino acid sequence database. 

One embodiment of the present invention may be the above-mentioned method 
for prediction wherein the oligopeptide is 4-15 amino acids in length. 

In addition, in another embodiment, the present invention relates to a recording 
medium carrying a program to predict a protein or polypeptide (B) that interacts with a 
specific protein or polypeptide (A), comprising at least the following means a) to f): 

a) a means for inputting amino acid sequence information of the protein or polypeptide 
(A) and storing the information; 

b) a means for decomposing the above-mentioned information into a series of 
oligopeptides having a pre-determined length as sequence information, and a means for 
storing the sequence information consequently obtained; 

c) a means for storing an input protein database; 

d) a means for accessing the stored protein database and detecting a protein or 
polypeptide (C) having an amino acid sequence of said oligopeptide or a protein or 
polypeptide (D) having an amino acid sequence homologous to the amino acid sequence 
of said oligopeptide, and a means for storing and calculating a detected result; 

e) a means for carrying out local alignment between the protein or polypeptide (A) and 
the detected protein or polypeptide (C) or protein or polypeptide (D), and a means for 
storing and calculating a result; and 

f) a means for obtaining a resultant value of a frequency of an amino acid and/or a 
frequency of said oligopeptide from a protein database, followed by showing an index for 
predicting protein-protein interactions from the resultant value and a resultant value of 
said local alignment, and a means for storing and displaying the result and consequently 
detecting protein or polypeptide (B) which interacts with the protein or polypeptide (A). 

In a further embodiment, the present invention relates to a recording medium 
comprising at least one of the following means g) to l) in addition to the means a) to f): 

g) a means for ranking strength of protein-protein interactions among detected proteins or 
polypeptides (B) based on the indexes calculated from a resultant value of local alignment 
and a resultant value of a frequency of an amino acid and/or a frequency of an 
oligopeptide in a protein database in the case that more than one protein or polypeptide 
(B) exist that are detected, and a means for storing and displaying the result; 

h) a means for displaying full-length of amino acid sequences of the protein or 
polypeptide (A) and the protein or polypeptide (B) that is detected, followed by indicating 
a location of partial sequence to be aligned in the full-length sequence in the case that 
amino acid partial sequences are aligned by local alignment between the protein or 
polypeptide (A) and the protein or polypeptide (B); 

i) a means for calculating a stereo structure model in the case that a stereo structure of the 
protein or polypeptide (A) or the protein or polypeptide (B) that is detected is known or in 
the case that homology modeling enables a stereo structure model to be made, followed by 
displaying the structure of the amino acid partial sequences that are aligned by local 
alignment between the protein or polypeptide (A) and the protein or polypeptide (B) on 
the stereo structure; 

j) a means of classifying proteins in a protein database to narrow a searching area and 
storing the same; 

k) a means for serially inputting each protein in a protein database as the protein or 
polypeptide (A); and 

l) a means for storing a genome database. 

In still another embodiment, the present invention relates to a device for 
predicting protein-protein interactions comprising the means that are carried by the 
above-mentioned recording medium. 

In an additional embodiment, the present invention relates to a method for 
specifying proteins or polypeptides that interact with each other, which comprises 
identifying a protein or polypeptide (B) that is predicted to interact with a specific protein 
or polypeptide (A) by the above-mentioned prediction method or prediction device, and 
then experimentally confirming the presence of the interaction between the protein or 
polypeptide (A) and the protein or polypeptide (B). 

Furthermore, in another embodiment, the present invention relates to a protein or 
polypeptide that is specified by the above method. 

In still another embodiment, the present invention relates to a method of 
screening for a compound that is capable of controlling the interaction of a specific 
protein or polypeptide (A) with a protein or polypeptide (B) utilizing the 
above-mentioned prediction method or prediction device. 

In yet another embodiment, the present invention relates to a novel compound 
obtained by the screening method and a novel compound capable of controlling the 
interaction of the protein or polypeptide (A) with the protein or polypeptide (B) obtained 
by drug design based on information of the compound obtained. 

In another embodiment, the present invention relates to an oligopeptide 
comprising amino acid sequence SEQ ID NO: 1 which is capable of controlling the 
interaction of verotoxin 2 (VTII) with Bcl-2, or an oligopeptide that comprises an amino 
acid sequence homologous to the oligopeptide and is capable of controlling the 
interaction of VTII with Bcl-2, or a polypeptide that contains any of these oligopeptides 
and is capable of controlling the interaction of VTII with Bcl-2. 

In addition, in one embodiment, the present invention relates to an agent against 
cell death comprising an oligopeptide comprising amino acid sequence SEQ ID NO: 1. 

In still another embodiment, the present invention relates to a method of 
screening for a compound capable of controlling interaction of VTII with Bcl-2, wherein 
the method utilizes the above-mentioned oligopeptide and/or the above-mentioned 
polypeptide. 

In yet another embodiment, the present invention relates to a method for 
determining a sequence of an oligonucleotide coding an oligopeptide involved in 
interaction of a specific protein or polypeptide (A) with a protein or polypeptide (B) that 
is predicted to interact with the protein or polypeptide (A), wherein the method uses the 
above-mentioned prediction method or the above-mentioned prediction device. 

In a further embodiment, the present invention relates to a series of combinations 
of human proteins, which are predicted to interact with each other, identified by the 
above-mentioned prediction method or the above-mentioned prediction device. 

In addition, in an embodiment, the present invention relates to a method for 
selecting a combination of proteins having a protein-protein interaction that is related to a 
disease, wherein the method comprises selecting the combination based on the 
information of a known protein that is related to the disease from the above-mentioned 
series of combinations of proteins. 

Further in another embodiment, the present invention relates to a series of 
combinations of proteins having protein-protein interaction that are related to diseases, 
and which are obtained by the above-mentioned method. 

In yet another embodiment, the present invention relates to a method of 
screening for a compound that controls the interaction of a certain combination and/or 
two proteins further selected from the series of combinations of proteins having a 
protein-protein interaction that are related to diseases obtained as mentioned above. 

In a still further embodiment, the present invention relates to a compound 
identified by the method of screening for a compound which controls the interaction. 

In yet another embodiment, the present invention relates to a method for 
predicting a processing site of a protein by predicting the protein-protein interaction of a 
specific protein with an enzyme cleaving said protein using the above-mentioned 
prediction method or device. 

In addition, in one embodiment, the present invention relates to an amino acid 
sequence that contains a protein-processing site obtained by the above-mentioned 
prediction method for a protein-processing site, and/or an amino acid sequence that 
contains a partial sequence homologous to the processing site. 

BRIEF DESCRIPTION OF DRAWINGS 

Fig. 1a illustrates 20 amino acid residues from the amino terminal end of 
verotoxin 2. 

Fig. 1b illustrates oligopeptides, each having an amino acid sequence length of 5 
residues, which were obtained by decomposing the amino acid sequence consisting of 
said 20 residues as the sequence information using a program. 

Fig. 1c illustrates oligopeptides, each having an amino acid sequence length of 6 
residues, which were obtained by decomposing the amino acid sequence consisting of 
said 20 residues as the sequence information using a program. 

Fig. 2 illustrates oligopeptides, each having an amino acid sequence length of 5 
residues, which were obtained by decomposing the 13 residues from the amino terminal 
end of verotoxin 2 as the sequence information using a program, and human proteins 
comprising the amino acid sequence of the oligopeptides. 

Fig. 3 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of verotoxin 2 (VTII) and human β-adrenergic receptor kinase 2 
(ARK2) were obtained. 

Fig. 4a illustrates the frequency of each amino acid in protein synthesis of 
Escherichia coli. 

Fig. 4b illustrates the percentage that each amino acid is present in protein 
synthesis of Escherichia coli. 

Fig. 5 is a simplified flow of means. 

Fig. 6 illustrates amino acid sequences of oligopeptides derived from verotoxin 2 
which are also present in human proteins that are related to cell death, and the 
corresponding human proteins. 

Fig. 7 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of verotoxin 2 (VTII) and Bcl-2, were obtained. 

Fig. 8 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of verotoxin 2 (VTII) and Bcl-xL, were obtained. 

Fig. 9 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of verotoxin 2 (VTII) and MCL-1, were obtained. 

Figs. 10a-b illustrate the result of local alignment whereby oligopeptides that 
comprise a portion of verotoxin 1 (VTI) and Bcl-2 (Fig. 10a) or Bcl-xL (Fig. 10b), were 
obtained. 

Figs. 11a-b illustrate electrophoretic patterns showing the result of 
confirmatory experiments using HepG2 cells and B10 cells showing that verotoxin 2 
(VTII) and Bcl-2 interact with each other (Fig. 11a and Fig. 11b, right), but that 
verotoxin 1 (VTI) and Bcl-2 do not interact with each other (Fig. 11b, left). In the 
figures, Bcl-2 IPs and VTII IPs indicate that Bcl-2 and VTII were immunoprecipitated by 
anti-Bcl-2 antibody and anti-VTII antibody, respectively. Fig. 11a (left) illustrates the 
result of western blotting using anti-Bcl-2 antibody (Bcl-2 WB). Fig. 11a (right) 
illustrates the result of western blotting using anti-VTII antibody (VTII WB). Fig. 11b 
illustrates electrophoretic patterns showing the result of confirmation of the subcellular 
fraction of B10 cells that were treated with verotoxin 1 (VTI) (left) or verotoxin 2 (VTII) 
(right), where these proteins were detected using anti-VTI antibody and anti-VTII 
antibody, respectively. 

Fig. 12 illustrates the sites which correspond to the local alignment of verotoxin 
2 (VTII) and Bcl-2. 

Fig. 13 illustrates, using a wire model, the portion that is homologous to the 
partial sequence of verotoxin 2 (VTII) on the stereo structure of Bcl-xL. 

Fig. 14 illustrates, using a wire model, the portion that is homologous to the 
partial amino acid sequence of Bcl-xL on the stereo structure of verotoxin 2 that is 
constructed by homology modeling. 

Fig. 15 illustrates that oligopeptide NWGRI, which comprises a portion of 
verotoxin 2 (VTII) and Bcl-2, suppresses cell death induced by VTII in a manner 
dependent on the dose of NWGRI. 

Figs. 16a-b illustrate the result of local alignment whereby oligopeptides that 
comprise a portion of human helper T cell surface protein CD4 and HIV-1 virus surface 
protein gp120, were obtained. Fig. 16a illustrates oligopeptides that comprise a portion of 
CD4 and gp120. Fig. 16b illustrates amino acid sequences of a region having a high local 
homology in CD4 and gp120. 

Fig. 17 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of CED-4 (a cell death-related protein of nematode) and MAC-1 
protein (which binds to CED-4), were obtained. 

Fig. 18 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of amyloid precursor protein (APP) and BASE (an enzyme which 
cleaves the protein), were obtained. 

Figs. 19a-b illustrate the result of local alignment whereby oligopeptides that 
comprise a portion of furin-precursor protein (furin-pre) and von Willebrand factor 
precursor protein (VWF-pre), were obtained. Fig. 19a illustrates oligopeptides that 
comprise a portion of both proteins. Fig. 19b illustrates an amino acid sequence of a 
region having a high local homology in both proteins. 

Fig. 20 illustrates the result of local alignment whereby oligopeptides that 
comprise a portion of amyloid precursor protein (APP) and protein PC7 (which is 
considered to be involved in the processing thereof), were obtained. In the figure, symbol 
"=" indicates a site that is predicted to be a cleavage site. 



BRIEF DESCRIPTION OF REFERENCE NUMERALS 

a Means for inputting 

b Means for decomposing into a series of oligopeptides and storing the same 

c Means for storing 

d Means for searching and storing 

e Means for carrying out local alignment and storing 

f Frequency-calculating/memory-displaying means 

g Ranking/memory-displaying means 

h Location-displaying means 

i Stereo structure-calculating/memory-displaying means 

j Means for classifying proteins and storing 

k Sequentially inputting means 

l Means for storing 

m Keyboard 

n Controlling means 

o Outputting means 



DETAILED DESCRIPTION OF THE INVENTION 

The embodiments of the present invention are described in more detail below, 
together with the principle and method of the present invention, a recording medium that 
carries a program for carrying out the method, a device that performs the function, and 
proteins and polypeptides that are obtained by the method and device. The following 
description is given only for illustration, and is not intended to limit the present invention. 

Technical and scientific terms used in the specification have the meanings 
usually understood by one of ordinary skill in the art to which the present invention 
pertains, unless otherwise defined. Reference is made herein to various methodologies 
known to those of ordinary skill in the art. Publications and other materials setting forth 
such known methodologies to which reference is made are incorporated herein by 
reference in their entireties. 

In one embodiment, the present invention relates to a method for predicting 
protein-protein interactions, which is based on the following idea: a protein is composed 
of a sequence of 20 kinds of amino acids, but these amino acids are not placed randomly. 
It is therefore considered that an oligopeptide that is a partial sequence of a 
protein plays a role within a species of living organism. 

For example, an oligopeptide that is a part of a certain enzyme is considered to 
play a role in recognizing a substrate. In another protein, an oligopeptide that plays an 
important role in interacting with other proteins is considered to exist. In this way, it is 
necessary to consider the function or interaction of a protein from the oligopeptide level. 
In addition, from the viewpoint of frequency of the oligopeptide, the frequency of the 
appearance of certain oligopeptides in all of the proteins encoded by the genome in one 
organism is not even. Some oligopeptides frequently occur in various proteins; others do 
so only rarely. It is very likely that an oligopeptide that occurs with low frequency is an 
oligopeptide that is unique to each protein. Such an oligopeptide might determine the 
feature or function of the protein. 
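
The uneven occurrence of oligopeptides described above can be illustrated with a short counting sketch. It is only a minimal illustration in Python: the function name `oligopeptide_counts` and the toy sequences are assumptions made for this example and are not taken from the specification or from any real database.

```python
from collections import Counter

def oligopeptide_counts(proteins, k):
    """Count every overlapping k-residue oligopeptide across a set of sequences."""
    counts = Counter()
    for seq in proteins.values():
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

# Toy "proteome": names and sequences are invented for illustration only.
toy_proteome = {
    "protA": "MKLLLLLAG",
    "protB": "GALLLLLKC",
    "protC": "MKCILFKWV",
}

counts = oligopeptide_counts(toy_proteome, 5)

# Oligopeptides seen in more than one place are "frequent" in this toy set;
# those seen once are candidates for being unique to a single protein.
for peptide, n in counts.most_common(3):
    print(peptide, n)
```

In a real setting the dictionary would hold all proteins encoded by one genome, and the rarely occurring entries would be the ones of interest.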

On the other hand, the fact that proteins interact with each other implies that the 
interacting proteins perform a function in cooperation with each other whereby the 
organism carries on its biological activity. If it is assumed that one oligopeptide corresponds 
to one function, two proteins that interact with each other might have the same 
oligopeptide or homologous oligopeptides. In addition, these two proteins might have a 
homologous sequence structure in a part other than the shared oligopeptide. 

As one of the techniques of similarity search for analyzing the homology of two 
proteins, a method of comparison by aligning the primary structures of both proteins is 
known (Minoru Kanehisa "Introduction to genome informatics (in Japanese)" Kyoritsu 
Publishing Co., Ltd., 93-104, 1996). This sequence alignment includes 'global 
alignment' and 'local alignment'. The 'global alignment' comprises aligning the entire 
sequences, and the 'local alignment' comprises locally aligning only homologous parts 
extracted from their sequences. In either case, the alignment is carried out so that the 
relation between two or more sequences is as clear as possible. Many 
combinations exist in the alignment depending on the length of the sequence. Methods 
for carrying out combinatorial optimization include the dynamic programming method. 
The Smith-Waterman method (Smith, TF and Waterman, MS, J. Mol. Biol. 147, 195-197, 
1981), which is based on the principle of dynamic programming, gives an estimation 
function on the combinatorial optimization of sequences. A value of the estimation 
function, i.e., 'homology score' or 'score', permits estimating homology between two 
proteins. It can be estimated that the higher the score between the proteins that were 
compared, the higher the homology between these proteins. Local alignment 
is carried out by setting a threshold on the score and then carrying out combinatorial 
optimization of partial sequences when the combination of sequences is searched by 
dynamic programming (e.g., Gotoh, O., Pattern matching of biological sequences with 
limited storage, Comput. Appl. Biosci. 3, 17-20, 1987). The local alignment method 
permits searching for homologous structure in a part of the protein other than where the 
oligopeptide common to the two proteins is located. 
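
The dynamic-programming idea behind local alignment can be sketched as follows. This is a simplified Smith-Waterman-style illustration, not the scoring used in the specification: the match, mismatch and linear gap values are arbitrary assumptions (practical tools normally use a substitution matrix and affine gap penalties), and the two input sequences are invented.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    The scoring values are illustrative assumptions, not those of the source.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # A cell never drops below zero, which is what makes the alignment local.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Invented sequences sharing one five-residue stretch.
score, part_a, part_b = smith_waterman("AAANWGRIAAA", "CCCNWGRICCC")
print(score, part_a, part_b)
```

The returned score plays the role of the 'homology score' mentioned above: a threshold on it decides whether a locally aligned region is reported.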

Protein-protein interactions might have been conserved in the process of 
evolution. The case of verotoxin of Escherichia coli and Bcl-2, as described later, implies 
that one function has been conserved in the protein-protein interaction beyond species, 
whence structurally similar amino acid sequences might exist. In addition, in the 
processing of amyloid precursor protein and von Willebrand Factor (VWF) precursor 
protein, as described later, the function of proteolysis might have been conserved in the 
protein-protein interaction, and structurally similar amino acid sequences might exist. 

A method for predicting protein-protein interactions based on the above idea may 
also permit predicting a network among functions that have so far been known only as 
single functions. It may thereby describe a new image of life, based on results 
predicted on a computer and on the overall relation of actions, that differs from 
the image of life reached by accumulating facts obtained through 
the enumerative approach of molecular biology. 

In addition, if the prediction for interactions is possible not in one organism but 
between two organisms, e.g., human beings and pathogenic microorganisms, the 
elucidation of the pathogenesis that has not been known so far might become possible. 

Concretely, one embodiment of the above method for predicting protein-protein 
interactions is a method of extracting and predicting a counter-protein from a protein 
database and the like, which interacts with a protein that was obtained by genome 
analysis or cDNA analysis whose function is unknown or a protein whose function is 
known, wherein the method comprises, for example, the following steps 1-4: 

In step 1, an amino acid sequence of a specific protein or polypeptide is 
decomposed into a series of oligopeptides having a pre-determined length as the sequence 
information. In step 2, proteins or polypeptides are determined which comprise each 
oligopeptide. In step 3, homology of partial structures between the proteins is estimated 
by local alignment. In step 4, each oligopeptide is further evaluated by a frequency of 
occurrence. 

Each step is described in more detail below: 

Step 1: 

The amino acid sequence of a specific protein or polypeptide (A), such as a 
protein or polypeptide that is obtained by genome analysis or cDNA analysis and whose 
function is not known or a protein or polypeptide whose function is known, is 
decomposed into oligopeptides as sequence information by shifting, serially, by one 
amino acid residue from the amino terminal end to the carboxyl end. 

For example, Figs. 1a-b illustrate oligopeptides (Fig. 1b) having an amino acid 
length of 5 residues that were obtained by decomposing the first 20 residues (Fig. 1a) of 
verotoxin 2 (VTII) of Escherichia coli O157:H7 from its amino terminal end as sequence 
information. Amino acids and oligopeptides are given in their one-letter symbols 
hereafter. 

When step 1 is carried out, the amino acid length of oligopeptide that is 
decomposed as the sequence information is 4 to 15 residues, preferably 4 to 8 residues. 
The longer the length of an oligopeptide, the greater the particularity of the oligopeptide, 
as shown in Example 2. 
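
The decomposition of step 1 amounts to a sliding window that advances one residue at a time from the amino terminal end. A minimal sketch follows; the function name `decompose` and the 20-residue placeholder sequence are assumptions for illustration and do not reproduce the sequence of Fig. 1a.

```python
def decompose(sequence, length):
    """Split a sequence into overlapping oligopeptides, shifting by one residue."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]

# Placeholder 20-residue sequence (one of each amino acid), not verotoxin 2.
placeholder = "ACDEFGHIKLMNPQRSTVWY"

pentamers = decompose(placeholder, 5)        # 20 - 5 + 1 windows
hexamers = decompose(placeholder, 6)         # 20 - 6 + 1 windows
first_13 = decompose(placeholder[:13], 5)    # 13 - 5 + 1 windows

print(len(pentamers), len(hexamers), len(first_13))
print(pentamers[:2])
```

A sequence of n residues yields n - k + 1 oligopeptides of length k, so 13 residues give 9 pentamers, which agrees with the count stated for Fig. 2.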
Step 2: 

In step 2, a protein or polypeptide (C) comprising an amino acid sequence of an 
oligopeptide that was obtained by the decomposition in step 1 or a protein or polypeptide 
(D) having an amino acid sequence that is homologous to the oligopeptide is searched for 
in an amino acid sequence database of proteins or polypeptides. The number of detected 
proteins or polypeptides (C) or (D) can be large or can be one depending on the 
oligopeptide used. 

For example, Fig. 2 illustrates the results of searching a protein database 
(SWISS-PROT version 35) for proteins having any of the 9 oligopeptides, each 
consisting of 5 amino acids, obtained by decomposing 13 residues of verotoxin 2 (VTII) 
of Escherichia coli O157:H7 from the amino terminal end. Verotoxin 2 causes food 
poisoning and/or renal damage in human beings, so that human proteins can be targets 
for searching for counter-proteins that interact with VTII. For example, such a search 
shows that a human protein comprising oligopeptide KCILF shown in Fig. 2 (second) is 
β-adrenergic receptor kinase 2 (ARK2_HUMAN). 
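
The exact-match part of step 2 can be sketched as a substring search over a table of sequences. The example below is an assumption-laden illustration: the function name `find_proteins_containing`, the entry names and the sequences are all invented, and the search for merely homologous sequences (proteins or polypeptides (D)) is not covered.

```python
def find_proteins_containing(oligopeptides, database):
    """Map each oligopeptide to the names of database entries that contain it."""
    hits = {}
    for peptide in oligopeptides:
        hits[peptide] = [name for name, seq in database.items() if peptide in seq]
    return hits

# Toy database: entry names and sequences are invented, not SWISS-PROT records.
toy_database = {
    "TOY1_HUMAN": "AAKCILFGG",
    "TOY2_HUMAN": "MNWGRIVV",
    "TOY3_HUMAN": "PPPPPPPP",
}

hits = find_proteins_containing(["KCILF", "NWGRI", "WWWWW"], toy_database)
for peptide, names in hits.items():
    print(peptide, names)
```

As the text notes, an oligopeptide may match many entries, one entry, or none; an empty list corresponds to the last case.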
Step 3: 

Local alignment is carried out between the above protein or polypeptide (A) and 
the protein or polypeptide (C) or protein or polypeptide (D) that is obtained in the search 
in Step 2. 

For example, Fig. 3 illustrates the result of local alignment between verotoxin 2 
(VTII) and β-adrenergic receptor kinase 2 (ARK2). 
Step 4: 

If the result of the local alignment in step 3 shows any homology of partial 
sequence between the above protein or polypeptide (A) and the detected protein or 
polypeptide (C) and/or protein or polypeptide (D), the protein or polypeptide (C) and/or 
protein or polypeptide (D) are/is predicted to possibly be a protein or polypeptide (B) that 
interacts with protein or polypeptide (A). Moreover, the frequency of amino acid(s) 
and/or the frequency of oligopeptide that is present in both protein or polypeptide (A) and 
the detected protein or polypeptide (C) and/or protein or polypeptide (D) is calculated 
from a protein database, followed by evaluating the particularity of each oligopeptide in 
the protein database or in the genome of an organism having the protein or polypeptide 
(A) or the detected protein or polypeptide (C) and/or protein or polypeptide (D). If the 
particularity is high, the reliability of the prediction is evaluated to be high that the protein 
or polypeptide (C) and/or protein or polypeptide (D) are/is a protein or polypeptide (B) 
that interacts with the protein or polypeptide (A). 

For example, an index of particularity of the above oligopeptide KCILF is 
calculated to be 1284.86 x 10^-10 from the composition ratio (Fig. 4b) that is calculated 
from the frequency of amino acid (Fig. 4a) in all of the proteins encoded by the genome of 
Escherichia coli shown in Fig. 4, so that the particularity is high. An oligopeptide 
consisting of 5 amino acids that is calculated to have low particularity in the E. coli 
genome is LLLLL, whose particularity index is 136344.34 x 10^-10. An oligopeptide 
consisting of 5 amino acids that is calculated to have the highest particularity is CCCCC, 
with a particularity index of 2.208 x 10^-10, but the oligopeptide is not found in the E. coli 
genome. Therefore, the prediction that verotoxin 2 interacts with β-adrenergic receptor 
kinase 2 is evaluated to have high reliability from the value of the particularity index. 
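
One way to read the particularity index is as the product of the composition ratios of the residues in the oligopeptide, so that a smaller value means a rarer, more particular oligopeptide. This reading is an assumption: the text does not state the formula, although it is consistent with the values quoted for LLLLL and CCCCC. In the sketch below, the function name `particularity_index` is hypothetical, and the two ratios are approximations back-calculated from those quoted values (their fifth roots), not numbers read from Fig. 4b.

```python
from math import prod

def particularity_index(oligopeptide, composition):
    """Product of the composition ratios of the residues in the oligopeptide.

    Assumes the index is a simple product of per-residue ratios; the source
    does not spell out the formula.
    """
    return prod(composition[residue] for residue in oligopeptide)

# Approximate ratios back-calculated from the quoted LLLLL and CCCCC indexes.
# They are illustrative stand-ins, not the values of Fig. 4b.
approx_composition = {"L": 0.1064, "C": 0.01172}

index_lllll = particularity_index("LLLLL", approx_composition)
index_ccccc = particularity_index("CCCCC", approx_composition)

print(f"LLLLL: {index_lllll:.3e}")
print(f"CCCCC: {index_ccccc:.3e}")
```

With a full table of composition ratios for all 20 amino acids, the same function could be applied to any oligopeptide, such as KCILF.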

In order to confirm further the protein-protein interaction, the gene encoding the 
protein may be cloned for expression based on the information of the obtained proteins 
that interact with each other. For example, as described in the examples later, VTII and 
Bcl-2 were predicted to interact with each other by the above predicting method or a 
predicting device carrying a program for the predicting method. This could be confirmed 
by experiments wherein cells in which Bcl-2 is expressed and cells in which Bcl-2 is not 
expressed were treated with VTII, followed by co-immunoprecipitating with anti-Bcl-2 
antibody and anti-VTII antibody. In addition, it is possible to specify the oligopeptide as 
an important interacting site, for example, by introducing a mutation by a well-known 
method into the amino acid sequence of an oligopeptide that is predicted to be an 
interacting site, followed by confirming that the interaction is lost. The method for 
confirming interactions experimentally is not limited to the above; any 
technique available to those skilled in the art may be used. 

In addition, if it is confirmed that an oligopeptide, which was predicted to be an 
interacting site, interrupts a protein-protein interaction and suppresses any function or 
action of the protein, then such an oligopeptide can be utilized as an agent for suppressing 
the action of the protein. For example, as described in an example below, oligopeptide 
NWGRI, which was predicted to be the interacting site for VTII and Bcl-2, suppresses 
cell death induced by VTII and can be used as an agent against cell death. Such a 
low-molecular-weight compound can be utilized as pharmaceuticals, reagents, and the 
like. 

Next, a recording medium and device that carry a program for the above method 
for predicting protein-protein interactions will be described. The above recording 
medium and device comprise at least the following means (a) to (f). Fig. 5 illustrates an 
example of the constitution. 
Inputting means (a): 

A means for inputting the amino acid sequence information of a 
specific protein or polypeptide (A), such as a protein or polypeptide that was obtained 
by genome analysis or cDNA analysis and whose function is not known, or a protein or 
polypeptide whose function is known. 

Means for decomposing into a series of oligopeptides and storing the same (b): 

A means for decomposing the amino acid sequence information that was input 
by inputting means (a) into a series of oligopeptides having a pre-determined length as 
sequence information by shifting, serially, by one amino acid residue from the amino 
terminal end to the carboxyl end, and storing the result. 

Storing means (c): 

A means for storing a database that was input concerning a protein or 
polypeptide. 

Searching/storing means (d): 

A means for accessing a database concerning a protein or polypeptide that is 
stored in storing means (c), followed by searching for a protein or polypeptide (C) 
comprising the amino acid sequence of the above oligopeptide or a protein or polypeptide 
(D) comprising an amino acid sequence that is homologous to the amino acid sequence of 
the above oligopeptide, and storing the result. 

Carrying out local alignment/storing means (e): 

A means for carrying out local alignment between the above protein or 
polypeptide (A) and the detected protein or polypeptide (C) and/or (D), and storing the 
result. 

Frequency-calculating/memory-displaying means (f): 

A means for calculating an index for predicting protein-protein interactions from 
the result of the above local alignment and the result obtained after calculating a 
frequency of an amino acid and/or a frequency of an oligopeptide in a peptide or 
polypeptide database, and storing and displaying the result. 

In addition, the above program for predicting protein-protein interactions may 
also comprise the following means (g) to (l) in an appropriate combination: 

Ranking/memory-displaying means (g): 

A means having a function of ranking proteins or polypeptides (B), when more 
than one protein or polypeptide (B) is detected, by using the result of the local alignment 
and the result of the calculation of a frequency of an amino acid and/or a frequency of an 
oligopeptide from a protein database as indexes, and a function of storing/displaying the 
result. 

Location-indicating means (h): 

A means for displaying full-length amino acid sequences of the protein or 
polypeptide (A) and the protein or polypeptide (B) followed by indicating a location of 
partial sequence to be aligned in the full-length sequences in the case that amino acid 
partial sequences are aligned between the protein or polypeptide (A) and the detected 
protein or polypeptide (B). 

Stereo structure-calculating/memory-displaying means (i): 

A means for calculating a stereo structure model followed by displaying the 
structure of the amino acid partial sequences that are aligned between the protein or 
polypeptide (A) and the protein or polypeptide (B) on the stereo structure in the case that a 
stereo structure of the protein or polypeptide (A) or the protein or polypeptide (B) that is 
detected is known or in the case that a stereo structure model can be constructed by 
homology modeling. 

Protein-classifying/storing means (j): 

A means having a function of classifying proteins or polypeptides in a protein or 
polypeptide database by feature, function, and/or origin to narrow a searching area 
followed by storing them. 

Sequentially inputting means (k): 

A means for sequentially inputting each protein or polypeptide in a protein or 
polypeptide database as the protein or polypeptide (A). 

Storing means (l): 

A means having a function of storing a genome database. 

The above means are carried on an appropriate medium. 

As one embodiment of use, each of these means may be provided as a device 
containing a recording medium selectively carrying it as a program. A device for 
predicting protein-protein interactions is operated as described below (see Fig. 5). 

A specific protein or polypeptide (A), such as a protein or peptide which was 
input by an inputting means (a) and whose function is unknown or known, is decomposed 
by means for decomposing into a series of oligopeptides and storing the same (b) into a 
series of oligopeptides having a pre-determined length as sequence information, and the 
oligopeptides are stored. In this case, the protein or polypeptide (A) is sequentially input 
by a sequentially inputting means (k) from a means (c) that stores a database that was 
input concerning the protein or polypeptide, when desired. A search is carried out 
through a means (c) for the amino acid sequence of the above-stored oligopeptide by a 
searching/storing means (d), and a protein or polypeptide (C) comprising the amino acid 
sequence of the oligopeptide or a protein or polypeptide (D) comprising an amino acid 
sequence that is homologous to the amino acid sequence of the oligopeptide is detected 
and stored. When searching, it is also possible to classify proteins or polypeptides in a 
database to narrow the searching area by a protein-classifying/storing means (j), followed 
by searching within the resultant area. The above protein or polypeptide (A) and the 
detected protein or polypeptide (C) or (D) are subjected to local alignment by a locally 
aligning/memory-displaying means (e), and the result is stored. Next, by a 
frequency-calculating/storing means (f), the frequency of an amino acid and/or the 
frequency of the oligopeptide are/is calculated from a database that was stored on a means 
(c), and an index for predicting protein-protein interactions is calculated from the result 
and the above-obtained result of the local alignment and stored. Then, the following are 
displayed on the screen of the device: the protein or polypeptide (C) or (D) that 
is predicted to interact with the above protein or polypeptide (A), the amino acid 
sequence of the oligopeptide that is the same in these proteins, the frequency of the 
oligopeptide, the indexes for predicting protein-protein interactions, and so on. The displayed 
results permit identifying a protein or polypeptide (B) that interacts with the above 
protein or polypeptide (A) based on the indexes for predicting protein-protein interactions. 
In addition, concerning the above protein or polypeptide (B), it is also possible to display 
the functional information of a protein that is stored on a means (c) and the gene 
information from a means (l), equipped when desired, that stores a genome database. 
When more than one protein or polypeptide (B) is detected, a ranking/memory-displaying 
means (g) permits ranking the proteins or polypeptides (B) according to the indexes for their 
interaction with the above protein or polypeptide (A). It is also possible to indicate by a 
location-indicating means (h) which part of the full-length amino acid sequences of the 
above protein or polypeptide (A) and the detected protein or polypeptide (B) is the partial 
amino acid sequence that is aligned between the protein or polypeptide (A) and the 
protein or polypeptide (B). In addition, it is also possible to display a stereo structure of 
the protein or polypeptide (A) and the protein or polypeptide (B), as well as the part that is 
aligned between the protein or polypeptide (A) and the protein or polypeptide (B) by a 
stereo structure-calculating/memory-displaying means (i). This device may be equipped 
with a keyboard (m), controlling means (n), outputting means (o), and so on, as also shown 
in Fig. 5, in addition to these means (a) to (l). 
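
The decomposing and searching steps performed by means (b) and (d) can be illustrated with a minimal sketch. It assumes exact oligopeptide matches only (the homologous case (D) and the local alignment of means (e) are not covered), and the toy sequences, protein names, and function names below are hypothetical; only the pentapeptide NWGRI is taken from the text.

```python
from collections import defaultdict


def decompose(sequence, length):
    """Overlapping oligopeptides of the given length, shifted by one residue."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def build_index(database, length):
    """Map each oligopeptide to the set of database entries that contain it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for oligo in decompose(seq, length):
            index[oligo].add(name)
    return index


def find_candidates(query, database, length=5):
    """Return {protein name: sorted shared oligopeptides} for the query sequence."""
    index = build_index(database, length)
    hits = defaultdict(set)
    for oligo in decompose(query, length):
        for name in index.get(oligo, ()):
            hits[name].add(oligo)
    return {name: sorted(oligos) for name, oligos in hits.items()}


# Hypothetical toy sequences used only for illustration.
toy_database = {
    "protein_X": "MAAKNWGRIVAL",
    "protein_Y": "MSTTLLPEQKD",
}
query = "GGSNWGRITTE"
print(find_candidates(query, toy_database))  # {'protein_X': ['NWGRI']}
```

Building the index once per database, rather than scanning every entry for every oligopeptide, is a design choice of this sketch and is not stated in the text.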

The above method for predicting protein-protein interactions or the above 
prediction device can further be used for screening for a novel compound that controls the 
interaction of a specific protein or polypeptide (A) with a protein or polypeptide (B). The 
above method of screening for a novel compound that controls the interaction of a 
specific protein or polypeptide (A) with a protein or polypeptide (B) is carried out based 
on the information of the amino acid sequence of a key oligopeptide. A selected 
oligopeptide, an oligopeptide having an amino acid sequence 
homologous thereto, or a polypeptide comprising the amino acid sequence or the 
homologous amino acid sequence may itself be capable of controlling the interaction of 
the protein or polypeptide (A) with the protein or polypeptide (B). For example, in the 
case that the protein or polypeptide (B) has a receptor function for the protein or 
polypeptide (A), it is likely that an oligopeptide that is screened by the 
above technique is antagonistic to the interaction of the protein or polypeptide (A) with 
the protein or polypeptide (B). Likewise, in the case that the protein or polypeptide 
(A) is activated by the interaction with the protein or polypeptide (B), it is likely that an 
oligopeptide that is screened by the above technique has a function as an agonist. 

Concretely, as described in detail in the examples below, it was experimentally 
confirmed that an oligopeptide NWGRI described in SEQ ID NO: 1, which comprises a 
portion of VTII and Bcl-2, which were predicted and experimentally confirmed to interact 
with each other by the present invention, interrupts complex formation due to the 
interaction of VTII with Bcl-2, and suppresses cell death induced by VTII. Therefore, 
NWGRI oligopeptide can be used as a medicament for controlling a disease related to cell 
death induced by VTII, for example as a medicament for treating a disease caused by 
Escherichia coli O157 expressing VTII, more concretely as an agent against cell death. 
Moreover, an oligopeptide having an amino acid sequence homologous to the amino acid 
sequence and capable of controlling the interaction of VTII with Bcl-2, or a polypeptide 
comprising the amino acid sequence or an amino acid sequence homologous to the amino 
acid sequence and capable of controlling the interaction of VTII with Bcl-2, can also be 
used as a medicament for controlling a disease related to cell death induced by VTII. In 
addition, a novel compound capable of controlling the interaction of VTII with Bcl-2 can 
be obtained utilizing these oligopeptides and polypeptides by a drug design method or 
by applying a known screening method. 

In this way, a novel compound, which is obtained by drug design based on the 
information of an oligopeptide that is obtained by the above screening method according 
to an embodiment of the present invention, is capable of controlling the interaction of a 
specific protein or polypeptide (A) with a protein or polypeptide (B). Namely, predicting 
the interaction of the above protein or polypeptide (A) with the above protein or polypeptide 
(B) permits one to make, by a well-known drug design technique, a derivative of the 
oligopeptide obtained by the above screening method and a low-molecular-weight 
compound having a structure homologous to the oligopeptide. 

The above prediction method is also very useful for determining 
the sequence of the oligonucleotide encoding an oligopeptide involved in the interaction of a 
specific protein or polypeptide (A) with a protein or polypeptide (B). Applying 
well-known methods such as substitution, deletion, addition, insertion, or induced 
mutation based on this information permits one to obtain a useful oligonucleotide. The 
obtained oligonucleotide can be used for obtaining a compound for controlling the 
interaction of protein or polypeptide (A) with protein or polypeptide (B) on a gene level. 
For example, it is utilized for making an antisense oligonucleotide to interrupt the 
protein-protein interaction. In addition, the obtained oligonucleotide can be used for 
diagnosing a disease that is related to the protein-protein interaction. 

In another embodiment, the present invention relates to a series of combinations 
of human proteins that are predicted to have protein-protein interactions, which are 
predicted by the above method or device for predicting protein-protein interactions. The 
series of combinations of proteins can be provided as a catalogue or as a database. A series 
of combinations of interacting proteins that are involved in a disease 
can be obtained by selecting, from the series of combinations of proteins having 
protein-protein interactions, those whose interactions are related to the disease, based on 
the information of known proteins that can be related to the disease. These can be 
provided as a catalogue or as a database. These combinations of proteins are useful as 
medicaments for treating or preventing diseases or as ways to obtain medicaments. For 
example, a compound capable of controlling interaction of two proteins can be obtained 
by screening using a well-known screening method and by utilizing a combination of 
proteins that is obtained. 

In the case that among combinations consisting of two proteins having a 
protein-protein interaction, one protein is an enzyme capable of processing protein and 
cleaves the other protein, the processing site of the protein that is cleaved can be predicted 
by the above method or device for predicting protein-protein interactions. 

For example, as shown in an example below, prediction could be accomplished 
on the subject of the interaction of an amyloid precursor protein with an enzyme that is 
involved in its processing, and on the subject of the interaction of von Willebrand factor 
precursor protein with an enzyme furin that is involved in its processing. Namely, the 
above method or device for predicting protein-protein interactions permits predicting a 
cleavage site when a precursor protein is cleaved to act as a mature protein. In this way, a 
hitherto-unknown enzyme having a protein-processing action related to a disease and a 
protein that is cleaved by the enzyme can be obtained by predicting protein-protein 
interactions. 

Examples 

Although advantages, features, and possible applications of the present 
invention are described below in greater detail with reference to exemplary embodiments, 
the present invention is not limited to the following examples. In addition, although 
SWISS-PROT version 35 was used as a protein database in the following examples, other 
protein databases or the like can also be used. 

Example 1 

Figs. 1a-c illustrate oligopeptides that were decomposed from the first 20 
residues (Fig. 1a) of verotoxin 2 (VTII) of Escherichia coli O157:H7 from the amino 
terminal end, where the oligopeptides have an amino acid sequence length of 5 residues 
(Fig. 1b) as an example of step 1. Fig. 1c illustrates oligopeptides that were decomposed 
from the first 20 residues of verotoxin 2 (VTII) from the amino terminal end, where the 
oligopeptides have an amino acid sequence length of 6 residues. 
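
The decomposition of step 1 amounts to a sliding window that is shifted by one residue at a time. The following is a minimal sketch; the function name is an assumption, and the 20-residue sequence is a hypothetical placeholder, not the actual VTII sequence of Fig. 1a.

```python
def decompose(sequence, length):
    """Return every overlapping oligopeptide of the given length,
    shifting by one residue from the amino terminus to the carboxyl terminus."""
    if length < 1:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 20-residue sequence for illustration only (not VTII).
example = "ACDEFGHIKLMNPQRSTVWY"

pentamers = decompose(example, 5)  # 20 - 5 + 1 = 16 oligopeptides
hexamers = decompose(example, 6)   # 20 - 6 + 1 = 15 oligopeptides

print(len(pentamers), pentamers[0], pentamers[-1])
print(len(hexamers), hexamers[0], hexamers[-1])
```

A sequence of n residues yields n - k + 1 oligopeptides of length k, so 20 residues give 16 pentapeptides and 15 hexapeptides.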

Example 2 

In step 4 of the above method for predicting protein-protein interactions, values 
are used as an index for predicting the interaction of proteins or polypeptides. The values 
are calculated from the frequency of an amino acid in a protein or polypeptide database 
and the frequency of an oligopeptide in the protein or polypeptide database. By way of 
example, the particularity of oligopeptides is calculated from the frequency of the amino 
acids in all of the proteins encoded by the genome of Escherichia coli shown in Figs. 4a-b. 
The percentage 'Ai' of each amino acid 'ai' can be calculated to be as shown in Fig. 4b 
from the frequency of occurrence (Fig. 4a) of the 20 kinds of amino acids in all of the 
proteins encoded by the genome of Escherichia coli. 

The particularity of oligopeptide a1a2a3a4a5 is calculated to be 
A1 × A2 × A3 × A4 × A5. For example, in the case of oligopeptide KCILF, it is calculated to 
be 4.406610 × 1.170608 × 6.004305 × 10.639652 × 3.898962 × 10^-10. The particularity of 
oligopeptide LLLLL is calculated to be 136344.34 × 10^-10, and the particularity of 
oligopeptide CCCCC is calculated to be 2.20 × 10^-10. 

The smaller the value, the greater the particularity of the oligopeptide. The 
oligopeptide that has the highest particularity among those having an amino acid 
sequence length of 5 residues is oligopeptide CCCCC, but this oligopeptide does not 
occur in any of the proteins encoded by the genome of Escherichia coli. In contrast, the 
oligopeptide that has the lowest particularity is oligopeptide LLLLL. 
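
The particularity calculation can be reproduced with a short sketch. It uses only the five amino acid percentages quoted in the text for K, C, I, L, and F; the remaining fifteen values of Fig. 4b are not reproduced here, and the names `PERCENT` and `particularity` are assumptions.

```python
from math import prod

# Percentages quoted in the text for the E. coli proteome (Fig. 4b).
# Only the five residues that appear in the worked example are listed.
PERCENT = {
    "K": 4.406610,
    "C": 1.170608,
    "I": 6.004305,
    "L": 10.639652,
    "F": 3.898962,
}


def particularity(oligopeptide, percent=PERCENT):
    """Product of the residue percentages.

    For a pentapeptide the result is expressed in units of 10^-10,
    because each percentage contributes a factor of 10^-2.
    A smaller value means a greater particularity.
    """
    return prod(percent[residue] for residue in oligopeptide)


for oligo in ("KCILF", "LLLLL", "CCCCC"):
    print(oligo, round(particularity(oligo), 2))
```

The values obtained for LLLLL and CCCCC agree with the 136344.34 × 10^-10 and 2.20 × 10^-10 given in the text.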

When a protein or polypeptide (A) is decomposed into oligopeptides in step 1, 
the longer the oligopeptide, the greater the particularity of the oligopeptide. 

Example 3 

In step 4 of the above method for predicting protein-protein interactions, as an 
index for predicting the interaction of proteins or polypeptides, the result of the local 
alignment is used. An example is given here in which scores of the alignment of a 
partial sequence by Gotoh's method (Gotoh, O., Pattern matching of biological sequences 
with limited storage, Comput. Appl. Biosci. 3, 17-20, 1987) are used. In the following 
examples, when the score was 25.0 or higher, it was judged that the partial amino acid 
sequences are aligned (homologous) between a protein or polypeptide (A) and another 
protein or polypeptide (B). 
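
To make the notion of a local alignment score concrete, the following is a minimal sketch of the Smith-Waterman recurrence with a linear gap penalty. It is not Gotoh's limited-storage algorithm cited above, and the match, mismatch, and gap values are assumptions chosen for illustration, so the scores it returns are not on the same scale as the 25.0 threshold used in the examples.

```python
def local_alignment_score(a, b, match=2.0, mismatch=-1.0, gap=-2.0):
    """Best Smith-Waterman local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the score matrix.
    The scoring parameters are illustrative assumptions.
    """
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for i in range(1, len(a) + 1):
        curr = [0.0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0.0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical flanking residues around the pentapeptide NWGRI.
print(local_alignment_score("AAANWGRIAAA", "CCCNWGRICCC"))  # 10.0
```

In practice a substitution matrix and affine gap penalties would replace the fixed match and mismatch values used here.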

Suppose that "m" amino acid partial sequences are aligned between a 
protein or polypeptide (A) and another protein or polypeptide (B), with their scores being 
Si (1 ≤ i ≤ m) and the amino acid length of protein or polypeptide (B) being LB. The index 
for predicting the interaction of protein or polypeptide (A) with protein or polypeptide 
(B) calculated from the result of local alignment is defined as (S1 + S2 + ... + Sm)/LB. It is 
predicted that the higher the index, the stronger the interaction. 
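
The index defined above can be written as a small function. This is a sketch under stated assumptions: the function name is hypothetical, the example scores and length are made up for illustration, and the 25.0 cutoff of this example is applied inside the function for convenience.

```python
def interaction_index(scores, length_b, threshold=25.0):
    """Sum of the local alignment scores Si divided by the length LB of protein (B).

    Only partial sequences whose score reaches the threshold count as aligned.
    """
    if length_b <= 0:
        raise ValueError("length_b must be positive")
    aligned = [s for s in scores if s >= threshold]
    return sum(aligned) / length_b


# Hypothetical scores: the 18.0 hit falls below the threshold and is ignored.
print(interaction_index([31.0, 26.0, 18.0], 200))  # (31.0 + 26.0) / 200 = 0.285
```

Dividing by LB normalizes for protein length, so that a long protein does not rank highly merely because it offers more chances for partial alignment.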

Example 4 

Prediction of interaction of VTII with Bcl-2 

Verotoxin 2 (VTII) of Escherichia coli O157:H7 causes food poisoning and 
renal damage in human beings, but the mechanism of action is not well known (Sandvig, 
K., et al., Exp. Med. Biol. 412, 225-232, 1997; Paton, JC, and Paton, AW. Clin. 
Microbiol. Rev. 11, 450-479, 1998). This protein is a toxic protein. Therefore, human 
proteins relating to cell death serve as candidates of proteins interacting with this protein. 
Thus, human proteins that may interact with this protein were searched, specifically for 
human proteins relating to cell death, in protein database SWISS-PROT version 35, and 
an example is given below showing that they actually interact with each other. 

Among oligopeptides having an amino acid sequence length of 5 residues of 
verotoxin 2, those found to be contained in a human protein relating to cell death were the 
following four, i.e., LCLLL, QRVAA, EFSGN, and NWGRI, in SWISS-PROT version 
35 (see Fig. 6, where the human proteins are shown by using protein IDs of 
SWISS-PROT version 35). Values of particularity for these oligopeptides were 
calculated from the amino acid frequencies in all of the proteins encoded by the genome 
of Escherichia coli shown in Figs. 4a-b, i.e., the particularity of LCLLL was 
15001.03 × 10^-10; that of QRVAA was 15584.55 × 10^-10; that of EFSGN was 3801.65 × 10^-10; 
that of NWGRI was 1479.85 × 10^-10. It was found that NWGRI has the highest 
particularity among these four oligopeptides. 

Oligopeptide NWGRI comprises a portion of verotoxin 2 and each of three 
human proteins, i.e., Bcl-2, Bcl-xL, and MCL-1. Local alignment between verotoxin 2 
(VTII) and each of Bcl-2, Bcl-xL, and MCL-1 revealed partial homology in their amino 
acid sequence, as shown in Figs. 7, 8, and 9. Then, the sum of the scores of the local 
alignment was divided by the length of each protein to give the index as described in 
Example 3, as shown below. 

Bcl-2 (30.0 + 27.0 + 25.0) / 239 = 0.343 
Bcl-xL (30.0 + 29.0 + 27.0) / 233 = 0.369 
MCL-1 (34.0 + 30.0 + 28.0 + 26.0) / 350 = 0.337 

Among these three proteins, Bcl-2 and Bcl-xL belong to the same family. 
Based on the index calculated from the local alignment by the above method, the 
prediction is that, among Bcl-2, Bcl-xL and MCL-1, Bcl-2 and Bcl-xL interact more 
strongly with verotoxin 2. 
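
The three indexes and the resulting ranking can be checked with a few lines. The scores and protein lengths are those listed in this example; the variable names are assumptions.

```python
# Local alignment scores and protein lengths as listed in this example.
alignments = {
    "Bcl-2": ([30.0, 27.0, 25.0], 239),
    "Bcl-xL": ([30.0, 29.0, 27.0], 233),
    "MCL-1": ([34.0, 30.0, 28.0, 26.0], 350),
}

# Index = sum of scores / length of the candidate protein (Example 3).
indexes = {
    name: sum(scores) / length
    for name, (scores, length) in alignments.items()
}

# Rank candidates from the highest index to the lowest.
ranking = sorted(indexes, key=indexes.get, reverse=True)

for name in ranking:
    print(name, round(indexes[name], 3))
```

Sorted this way, Bcl-xL (0.369) and Bcl-2 (0.343) rank above MCL-1 (0.337).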

Verotoxin 1 (VTI) is one of the verotoxins produced by Escherichia coli 
O157:H7, and is an isoform of verotoxin 2. The toxicity of verotoxin 1 is weaker than 
that of verotoxin 2, with the former being about one fiftieth the latter (Tesh, VL., et al., 
1993, Infect. Immun. 61, 3392-3402). In protein database SWISS-PROT version 35, a 
human protein that contains an oligopeptide having an amino acid length of 5 residues 
that comprises a portion of verotoxin 1 and is related to cell death is P2X1_HUMAN, the 
oligopeptide being SSTLG. However, the particularity of oligopeptide SSTLG, which is 
calculated to be 14385.63 × 10^-10 from the amino acid frequencies in all of the proteins 
encoded by the genome of Escherichia coli shown in Figs. 4a-b, is lower than that of 
NWGRI, by a factor of about ten. 

In verotoxin 1, the oligopeptide NWGRL that corresponds to oligopeptide 
NWGRI having an amino acid length of 5 residues in verotoxin 2 has a particularity 
of 2622.30 × 10^-10, calculated from Figs. 4a-b, that is lower than that of NWGRI. Both 
Bcl-2 and Bcl-xL contain oligopeptide NWGR having an amino acid sequence length of 4 
residues that comprises a portion of verotoxin 1. Comparison between the particularity of 
NWGRI and that of NWGRL permits prediction that both Bcl-2 and Bcl-xL interact more 
strongly with verotoxin 2 than with verotoxin 1. In addition, the indexes obtained by the 
calculation from the result (Fig. 10) of the local alignment between verotoxin 1 and Bcl-2 
or Bcl-xL are (27.0 + 26.0)/239 = 0.222 and 26.0/233 = 0.112, respectively (there is no 
homologous amino acid partial sequence other than the NWGR part). Consequently, it is 
predicted that the interaction of verotoxin 1 with Bcl-2 or Bcl-xL is considerably weaker 
than the interaction of verotoxin 2 with Bcl-2 or Bcl-xL. 
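
As a consistency check, the NWGRL value follows from the NWGRI value by swapping the percentage of isoleucine for that of leucine, since the two oligopeptides differ only in their last residue. The sketch below uses the I and L percentages quoted in Example 2 and the scores quoted in this example; the variable names are assumptions.

```python
# Amino acid percentages for L and I as quoted in Example 2 (E. coli proteome).
PERCENT_L = 10.639652
PERCENT_I = 6.004305

# Particularity of NWGRI in units of 10^-10, as given in this example.
nwgri = 1479.85

# NWGRL differs from NWGRI only in the last residue (L instead of I).
nwgrl = nwgri * PERCENT_L / PERCENT_I
print(round(nwgrl, 2))

# Indexes for verotoxin 1 from the scores and lengths quoted in this example.
vti_indexes = {
    "Bcl-2": (27.0 + 26.0) / 239,
    "Bcl-xL": 26.0 / 233,
}
for name, value in vti_indexes.items():
    print(name, round(value, 3))
```

The result agrees with the 2622.30 × 10^-10 stated for NWGRL to within rounding, and the two indexes round to 0.222 and 0.112.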

Example 5 

Experimental confirmation of prediction of interaction of VTII with Bcl-2 

In Example 4, the reliability of the prediction that verotoxin 2 interacts with human 
Bcl-2 or Bcl-xL was judged to be high. Based on the result of this prediction, it was 
experimentally confirmed that verotoxin 2 actually interacts with Bcl-2 (Fig. 11a and Fig. 
11b (right)). Specifically, human hepatic cancer cell HepG2 (essentially not expressing 
the Bcl-2 gene) and B10 cells prepared by transducing a Bcl-2-expressing vector into 
HepG2 so as to express Bcl-2, were treated with verotoxin 2 (VTII), and then 
co-immunoprecipitation was conventionally carried out using anti-Bcl-2 antibody (Bcl-2 
IPs) and anti-VTII antibody (VTII IPs). 

Fig. 11a (left) illustrates the result of the western blotting analysis using 
anti-Bcl-2 antibody after co-immunoprecipitation; Fig. 11a (right) illustrates the result of 
the western blotting analysis using anti-VTII antibody. It was confirmed from these 
results that a VTII/Bcl-2 complex was co-immunoprecipitated in the B10 cells, i.e., these 
two proteins interact with each other. Moreover, B10 cells were treated with verotoxin 1 
(VTI) or verotoxin 2 (VTII) to examine in which subcellular fraction these proteins were 
detected using anti-VTI antibody and anti-VTII antibody. Bcl-2 in mitochondria plays a 
very important role in cell death. Verotoxin 2 (VTII) was detected also in a mitochondria 
fraction (Fig. 11b (right)). 

On the other hand, verotoxin 1 was not detected in the mitochondria fraction. 
Thus, it was proved experimentally that verotoxin 1 does not have a strong interaction 
with mitochondrial Bcl-2. The result is shown in Fig. 11b (left). 

Example 6 

Fig. 12 illustrates an example wherein the full-length amino acid sequences of 
verotoxin 2 (VTII) and Bcl-2 were displayed so as to show the locations of the partial 
sequences aligned in the full-length sequences. 

Example 7 

The stereo structure of Bcl-xL is known, with the structure being registered in 
PDB, a protein stereo structure database. Based on the result of the local alignment 
of Fig. 8, partial amino acid sequences homologous to those of verotoxin 2 in the stereo 
structure of Bcl-xL are shown with bold lines in Fig. 13. 

Example 8 

Verotoxin 2 is believed to cleave a part of ribosomal RNA so as to stop protein 
synthesis, thereby exerting its toxicity. The stereo structure of protein 'ricin' that cleaves 
a part of ribosomal RNA is registered in PDB, a protein stereo structure database. 
Based on the structure, homology modeling of verotoxin 2 was carried out. Based on the 
result of the local alignment of Fig. 8, the amino acid partial sequences homologous to 
those of Bcl-xL are shown in the stereo structure model with bold lines in Fig. 14. 

Example 9 

Suppression by NWGRI of cell death induction by VTII 

Next, it was experimentally confirmed that oligopeptide NWGRI (SEQ ID NO: 
1), which was found in Example 4 and comprises a portion of VTII and Bcl-2, can control 
the interaction of VTII with Bcl-2. First of all, the complex formation was examined 
using an extract of the Bcl-2-expressing B10 cells used in Example 5 and biotinylated 
VTII in the presence of oligopeptide, and then analyzed by Far Western blotting analysis. 
Oligopeptide NWGRI interrupted the complex formation of VTII and Bcl-2 in a dose 
dependent manner. 

In addition, B10 cells were pretreated with oligopeptide NWGRI at 0, 10, 50, 
100 nM and were treated with VTII at 10 ng/ml for 24 hr, and the induction of cell death 
by apoptosis was assayed. A total of about 5,000 nuclei were stained with Hoechst 33342/PI 
(Propidium iodide) according to the conventional method, and the ratio of nuclei that 
showed apoptosis is shown in Fig. 15. As shown in the figure, about 85% of cells underwent 
apoptotic cell death by the treatment with only VTII, while the induction of apoptotic cell 
death was suppressed by pretreatment with oligopeptide NWGRI in a dose dependent 
manner. Thus, it was confirmed that oligopeptide NWGRI, which comprises a portion of 
VTII and Bcl-2, interrupts the interaction of VTII with Bcl-2 so as to inhibit the complex 
formation of these proteins and suppresses cell death induction by VTII thereupon. 



Example 10 

CD4/gp120 HIV-1 

Human AIDS virus HIV-1 infects helper T cells. An important first step to 
infecting these cells is that protein gp120 on the viral surface binds to surface protein 
CD4 of helper T cells. In this example, it was examined if the binding of gp120 and CD4 
can be predicted by the above prediction method. 

Protein CD4 was decomposed into oligopeptides having an amino acid sequence 
length of 5 residues, and proteins having the amino acid sequence of the oligopeptide 
derived from CD4 were serially searched in a protein database, and gp120 was extracted 
as a protein that contains oligopeptide SLWDQ (Fig. 16a). Oligopeptide SLWDQ exists 
only in protein CD4 as a human protein in SWISS-PROT version 35, i.e., the frequency in 
human proteins is 1 and the particularity is very high. Moreover, besides this 
oligopeptide, a locally homologous region exists (Fig. 16b). It is known that amino acid 
residue arginine (Arg) next to oligopeptide SLWDQ on the N-terminal side and 
67-SFLTKGP-73 play important roles when CD4 binds to gp120 (Kwong, PD., et al., 
Nature, vol. 398, 648-659, 1998). It is also known that a few amino acid residues next to 
the homologous region (289-KTIIVQLNETVKINCIRPNNKT-310) shown in Fig. 16b on 
the N-terminal side are one of the regions playing an important role when CD4 is 
recognized by gp120 (Kwong, PD., et al., Nature, vol. 398, 648-659, 1998). Therefore, 
even if the binding between gp120 and CD4 were not known, it could be predicted by the 
above prediction method. 

Example 11 

CED-4/MAC-1 

Nematode Caenorhabditis elegans is the first multicellular organism whose 
entire genome information was elucidated. One example concerning C. elegans is 
described here. Protein CED-4 plays a central role in the control of programmed cell 
death. MAC-1 was found to be a protein that binds to CED-4 and suppresses cell death 
(Wu et al., Development, vol. 126, 9, 2021-2031, 1999). Therefore, oligopeptides that 
comprise a portion of these two proteins were examined so as to verify the present 
invention, although the binding between MAC-1 and CED-4 is known. As a result, it was 
found that MAC-1 and CED-4 contain the same oligopeptide FPSVE having an amino 
acid sequence length of 5 residues, and the present invention was verified. The index of 
this oligopeptide, calculated from a frequency of amino acids in the genome of C. elegans, 
was 5.436. Moreover, as illustrated in Fig. 17, there are many homologous regions 
between these two proteins, whereby the binding of these proteins was strongly suggested 
(top sequence, CED-4; bottom sequence, MAC-1). 

Example 12 

APP/BASE 

APP (amyloid precursor protein), which is one of the proteins causing 
Alzheimer's disease, gives rise to amyloid upon being cleaved at two sites. An enzyme 
(BASE, beta secretase) that cleaves the site on the amino terminal side of the two 
cleavage sites was recently discovered (Vassar et al., Science, 286(5440), 735-741, 
1999). Cleavage of APP by BASE indicates the presence of the interaction of these two 
proteins. To verify the present invention, oligopeptides that comprise a portion of these 
two proteins were examined. APP and BASE have homologous oligopeptides WYFDV 
and WYYEV having an amino acid sequence length of 5 residues that comprise a portion 
of each of them. The oligopeptide WYFDV exists only in protein APP as a human 
protein in SWISS-PROT version 35. A human protein comprising WYYEV is not 
registered yet. Both oligopeptides have high particularity. This result verified the present 
invention. Fig. 18 illustrates the regions homologous between the two proteins (top 
sequence, APP; bottom sequence, BASE). 

Example 13 

Furin and von Willebrand factor 
Furin is an intracellular serine protease involved in the secretory pathway
of proteins such as von Willebrand factor (VWF), albumin, and complement C3. An
example of the interaction of furin with VWF is described here. VWF is cleaved from a
precursor protein by furin to act as a mature protein. Cleavage of the VWF precursor
protein by furin requires the interaction of these two proteins. Moreover, furin itself
is cleaved from its own precursor protein to become a mature protein that acts as a
protease. Therefore, to verify the present invention, oligopeptides that comprise a
portion of both furin precursor protein and VWF precursor protein were examined. The two
proteins comprise the same oligopeptide, HCPPG, at positions 613-617 of furin
precursor protein and at positions 1176-1180 of VWF precursor protein (Fig. 19a). Both
locations are within the regions of the mature proteins. Among the human proteins in
SWISS-PROT version 35, the oligopeptide HCPPG comprises a portion of only furin
precursor protein and VWF precursor protein, and has high particularity from the viewpoint of
frequency. VWF precursor protein is cleaved by furin at the site between the 763rd amino
acid residue and the 764th amino acid residue. Local alignment between furin precursor
protein and VWF precursor protein reveals that the region near the site of VWF precursor
protein cleaved by furin has a partial region homologous to furin precursor protein (Fig.
19b). Thus, even if a novel protein is presumed to be a protease from the motif of its
active site, and neither its counterpart protein nor the cleavage site in the counterpart protein
is known, the present invention permits predicting the counterpart protein, as well
as the cleavage site in the counterpart protein.
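
The search for an oligopeptide common to two proteins, as in this example, can be sketched as follows. This is only a minimal illustration: the function name and the two toy sequences are hypothetical and are not the actual furin or VWF precursor sequences. The sketch indexes every 5-residue window of each sequence and reports the shared windows with their 1-based start positions.

```python
def shared_oligopeptides(seq_a, seq_b, k=5):
    """Return {oligopeptide: (starts_in_a, starts_in_b)} using 1-based positions."""
    def index(seq):
        # Map every window of length k to the list of positions where it starts.
        table = {}
        for i in range(len(seq) - k + 1):
            table.setdefault(seq[i:i + k], []).append(i + 1)
        return table

    idx_a, idx_b = index(seq_a), index(seq_b)
    return {pep: (idx_a[pep], idx_b[pep]) for pep in idx_a if pep in idx_b}


# Hypothetical toy sequences that merely embed HCPPG for illustration.
toy_a = "MKTAYHCPPGLLSR"
toy_b = "GSHCPPGWQE"
print(shared_oligopeptides(toy_a, toy_b))
```

With real sequences the same call would report positions in the coordinate system of whichever database entries are supplied.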
APP alpha is a peptide formed by cleavage of amyloid precursor protein (APP)
at a site different from the two cleavage sites that form amyloid. It was recently found that
PC7 (proprotein convertase subtilisin/kexin type 7) is involved in the cleavage that forms
APP alpha (Lopez-Perez E et al., J. Neurochem., vol. 73, 5, 2056-2062, 1999).
Examination of the oligopeptides that comprise a portion of the two proteins APP and
PC7 revealed that APP and PC7 have homologous oligopeptides, DSDPSG and DSDPNG,
each 6 amino acid residues long, that comprise a portion of the respective proteins.
Among the human proteins in SWISS-PROT version 35, the oligopeptide DSDPSG exists only
in APP, and no human protein comprising DSDPNG is registered yet.
Both oligopeptides therefore have very high particularity. This result verified the present invention.
Fig. 20 illustrates regions homologous between the two proteins.

The cleavage site that forms APP alpha lies between K and L of
687-KLVFFAEDVGS-697 of APP in Fig. 20, and Fig. 20 illustrates that a partial sequence
(359-RMPFYAEECAS-369) homologous to this exists in PC7. Namely, this example
shows that the present invention permits predicting a protein involved in cleaving another 
protein. 
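
Detection of homologous, rather than identical, oligopeptides such as DSDPSG and DSDPNG can be sketched as below. The criterion used here, at most one mismatching residue between two windows of equal length, is an assumption chosen for illustration and is not necessarily the homology criterion of the present invention. The function name and the toy sequences are likewise hypothetical.

```python
def near_matches(seq_a, seq_b, k=6, max_mismatches=1):
    """List (pep_a, start_a, pep_b, start_b, mismatches) for similar k-residue windows."""
    hits = []
    for i in range(len(seq_a) - k + 1):
        pep_a = seq_a[i:i + k]
        for j in range(len(seq_b) - k + 1):
            pep_b = seq_b[j:j + k]
            # Count the positions at which the two windows differ.
            mismatches = sum(x != y for x, y in zip(pep_a, pep_b))
            if mismatches <= max_mismatches:
                hits.append((pep_a, i + 1, pep_b, j + 1, mismatches))
    return hits


# Hypothetical toy sequences embedding the two oligopeptides from this example.
toy_app = "AADSDPSGTT"
toy_pc7 = "KLDSDPNGRV"
print(near_matches(toy_app, toy_pc7))
```

The exhaustive pairwise comparison is quadratic in sequence length, which is acceptable for a pair of proteins but would need indexing for a whole database.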

INDUSTRIAL APPLICABILITY

As described above in detail, the present invention permits predicting, by using a
protein database, a counterpart protein that interacts with a protein of unknown
function obtained by genome analysis or cDNA analysis, or with a protein of
known function. Namely, protein-protein interactions in an organism whose genome
information has been elucidated can be predicted on a computer using the protein databases,
based on genome analysis and cDNA analysis, that have recently been enhanced. Once
prediction on a computer is possible, information concerning proteins
predicted to interact with each other can be easily
obtained without resorting to a risky technique whose result depends on the cDNA
library used, such as the two-hybrid method. Such prediction also makes it
possible to easily predict the sequence of an oligopeptide involved in the
interaction, and to design a novel compound capable of controlling protein-protein
interactions based on that information. The present invention makes the elucidation of
protein-protein interactions efficient, and can be widely utilized in various fields
including biochemistry, molecular biology, pharmaceutical development, agriculture, and
biotechnology. In particular, in the development of pharmaceuticals, the present invention
permits predicting disease mechanisms that have not so far been known, and opens the
possibility of creating novel pharmaceuticals.
What is claimed is:

1. A method for predicting a protein or polypeptide (B) that interacts with a specific
protein or polypeptide (A), wherein the method is characterized by comprising: 

1) decomposing the amino acid sequence of protein or polypeptide (A) into a series of 
oligopeptides having a pre-determined length as sequence information; 

2) searching, within a database of protein or polypeptide amino acid sequences, for a 
protein or polypeptide (C) comprising an amino acid sequence for each member of 
the series or for a protein or polypeptide (D) comprising an amino acid sequence 
homologous to an amino acid sequence for each member of the series; 

3) carrying out local amino acid sequence alignment between said protein or 
polypeptide (A) and the detected protein or polypeptide (C) or detected protein or 
polypeptide (D); and 

4) predicting whether the detected protein or polypeptide (C) and/or protein or 
polypeptide (D) is a protein or polypeptide (B) that interacts with the protein or 
polypeptide (A) based on the results of the local amino acid sequence alignment and 
a value calculated from a frequency of amino acids and/or a frequency of said 
oligopeptides in said amino acid sequence database. 

2. The method according to claim 1, wherein the oligopeptide is 4-15 amino acids in 
length. 

3. A recording medium carrying a program to predict a protein or polypeptide (B) that 
interacts with a specific protein or polypeptide (A), wherein the recording medium is 
characterized by comprising at least the following means a) to f):

a) a means for inputting amino acid sequence information of the protein or 
polypeptide (A) and storing the information; 
b) a means for decomposing said information into a series of oligopeptides having a
pre-determined length as sequence information, and a means for storing the sequence 
information consequently obtained; 

c) a means for storing an input protein database; 

d) a means for accessing the stored protein database and detecting a protein or 
polypeptide (C) having an amino acid sequence of said oligopeptide or a protein or 
polypeptide (D) having an amino acid sequence homologous to the amino acid 
sequence of said oligopeptide, and a means for storing and calculating a detected 
result; 

e) a means for carrying out local alignment between said protein or polypeptide (A) 
and the detected protein or polypeptide (C) or protein or polypeptide (D), and a means 
for storing and calculating a result; and 

f) a means for obtaining a resultant value of a frequency of an amino acid and/or a 
frequency of said oligopeptide from a protein database, followed by showing an index 
for predicting protein-protein interactions from the resultant value and a resultant 
value of said local alignment, and a means for storing and displaying the result and 
consequently detecting protein or polypeptide (B) which interacts with the protein or 
polypeptide (A). 

4. A recording medium characterized by comprising at least the following means in 
addition to the means according to claim 3:

g) a means for ranking strength of protein-protein interactions among detected 
proteins or polypeptides (B) based on the indexes calculated from a resultant value of 
local alignment and a resultant value of a frequency of an amino acid and/or a 
frequency of an oligopeptide in a protein database in the case that more than one 
protein or polypeptide (B) exist that are detected, and a means for storing and 
displaying the result. 
5. A recording medium characterized by comprising at least the following means in
addition to the means according to claim 3 or 4: 

h) a means for displaying full-length of amino acid sequences of said protein or 
polypeptide (A) and said protein or polypeptide (B) that is detected, followed by 
indicating a location of partial sequence to be aligned in the full-length sequence in 
the case that amino acid partial sequences are aligned by local alignment between the 
protein or polypeptide (A) and the protein or polypeptide (B). 

6. A recording medium characterized by comprising at least the following means in 
addition to the means according to claim 3, 4 or 5: 

i) a means for calculating a stereo structure model in the case that a stereo structure of 
said protein or polypeptide (A) or said protein or polypeptide (B) that is detected is 
known or in the case that homology modeling makes it possible to build a stereo structure
model, followed by displaying the structure of the amino acid partial sequences that 
are aligned by local alignment between the protein or polypeptide (A) and the protein 
or polypeptide (B) on the stereo structure. 

7. A recording medium characterized by comprising at least the following means in 
addition to the means according to claim 3, 4, 5 or 6: 

j) a means of classifying and storing proteins in a protein database to narrow a 
searching area. 

8. A recording medium characterized by comprising at least the following means in 
addition to the means according to claim 3, 4, 5, 6 or 7: 

k) a means for serially inputting each protein in a protein database as said protein or 
polypeptide (A). 

9. A recording medium characterized by comprising at least the following means in 
addition to the means according to claim 3, 4, 5, 6, 7 or 8: 

l) a means for storing a genome database.
10. A device for predicting protein-protein interactions which comprises the means that
are carried by the recording medium according to claim 3, 4, 5, 6, 7, 8, or 9.

11. A method for specifying proteins or polypeptides that interact with each other, which
comprises identifying a protein or polypeptide (B) that is predicted to interact with a 
specific protein or polypeptide (A) using the method according to claim 1 or 2, and 
then experimentally confirming the presence of the interaction between the protein or 
polypeptide (A) and the protein or polypeptide (B). 

12. A method for specifying proteins or polypeptides that interact with each other, which 
comprises identifying a protein or polypeptide (B) that is predicted to interact with a 
specific protein or polypeptide (A) using the device according to claim 10, and then
experimentally confirming the presence of the interaction between the protein or 
polypeptide (A) and the protein or polypeptide (B). 

13. A protein or polypeptide that is specified by the method according to claim 11 or 12.

14. A method of screening for a compound that controls the interaction of a specific 
protein or polypeptide (A) with a protein or polypeptide (B), wherein the method 
utilizes the method according to claim 1 or 2. 

15. A method of screening for a compound that controls the interaction of a specific 
protein or polypeptide (A) with a protein or polypeptide (B), wherein the method uses 
the device according to claim 10. 

16. A novel compound obtained by the screening method according to claim 14 or 15. 

17. A novel compound capable of controlling the interaction of a specific protein or 
polypeptide (A) with a protein or polypeptide (B), which is obtained by drug design
based on information of a compound obtained by the screening method according to 
claim 14 or 15. 

18. An oligopeptide comprising the amino acid sequence of SEQ ID NO: 1, which is
capable of controlling the interaction of verotoxin 2 (VTII) with Bcl-2. 
19. An agent against cell death comprising an oligopeptide comprising the amino acid
sequence of SEQ ID NO: 1. 

20. An oligopeptide comprising an amino acid sequence homologous to the amino acid 
sequence of SEQ ID NO: 1, which is capable of controlling the interaction of VTII
with Bcl-2. 

21. A polypeptide comprising the amino acid sequence of the oligopeptide according to 
claim 18 or 20, which is capable of controlling the interaction of VTII with Bcl-2. 

22. A method of screening for a compound capable of controlling the interaction of VTII 
with Bcl-2, wherein the method utilizes the oligopeptide according to claim 18 or 20
and/or the polypeptide according to claim 21.

23. A method for determining the nucleotide sequence of an oligonucleotide coding an 
oligopeptide which is involved in the interaction of a specific protein or polypeptide 
(A) with a protein or polypeptide (B), wherein the method uses the prediction method
according to claim 1 or 2 or the prediction device according to claim 10. 

24. A series of combinations of human proteins that are predicted to interact with each 
other, obtained by the method according to claim 1 or 2 or the device according to 
claim 10. 

25. A method for selecting a combination of proteins which interact with each other, 
wherein said interaction is related to a disease, wherein the method comprises 
selecting the combination based on the information of a known protein that can be 
related to the disease from the series of combinations according to claim 24. 

26. A series of combinations of proteins which interact with each other, wherein said 
interaction is related to a disease, wherein each member of said series is selected 
according to the method of claim 25. 
27. A method of screening for a compound which controls the interaction of a
combination of proteins and/or two proteins selected from the series of combinations 
according to claim 26. 

28. A compound obtained by the method according to claim 27. 

29. A method for predicting a processing site of a protein comprising predicting the 
interaction of a specific protein with an enzyme cleaving said protein using the 
method according to claim 1 or 2 or the device according to claim 10. 

30. A polypeptide comprising an amino acid sequence that is predicted to contain a 
protein-processing site obtained by the method according to claim 29 and/or to 
contain a partial sequence homologous to the processing site. 
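
The local alignment recited in claims 1 and 3 can be illustrated with a basic Smith-Waterman sketch. The scoring values used here (match +2, mismatch -1, linear gap -2) are assumptions made for brevity; a practical implementation would more likely use an amino acid substitution matrix and the gap penalties preferred by the user. The function name and toy sequences are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b, start_a, start_b); starts are 1-based."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b)), i + 1, j + 1


# Hypothetical toy sequences sharing the segment HCPPG.
print(smith_waterman("AAAHCPPGTTT", "WWHCPPGYY"))
```

The returned start positions indicate where the aligned partial sequences lie in the full-length sequences, which is the kind of information the display means of claim 5 would present.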
ABSTRACT

The present invention relates to a method for predicting a protein or
polypeptide (B) that interacts with a specific protein or polypeptide (A), wherein the 
method is characterized by comprising: 

1) decomposing the amino acid sequence of protein or polypeptide (A) into a series of 
oligopeptides having a pre-determined length as sequence information; 

2) searching, within a database of protein or polypeptide amino acid sequences, for a 
protein or polypeptide (C) comprising an amino acid sequence for each member of 
the series or for a protein or polypeptide (D) comprising an amino acid sequence 
homologous to an amino acid sequence for each member of the series; 

3) carrying out local amino acid sequence alignment between said protein or 
polypeptide (A) and the detected protein or polypeptide (C) or detected protein or 
polypeptide (D); and 

4) predicting whether the detected protein or polypeptide (C) and/or protein or 
polypeptide (D) is a protein or polypeptide (B) that interacts with the protein or 
polypeptide (A) based on the results of the local amino acid sequence alignment and 
a value calculated from a frequency of amino acids and/or a frequency of said 
oligopeptides in said amino acid sequence database; and to 

a recording medium for carrying out the above method, a device comprising the
recording medium, and proteins obtained thereby. 
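
Steps 1) and 4) of the method summarized above can be sketched as follows. The sketch decomposes a sequence into overlapping oligopeptides and ranks the oligopeptides shared with a candidate protein by an expected frequency. Computing that value as the product of per-residue frequencies in the database is an assumption made for illustration; all function names and the toy data are hypothetical.

```python
from collections import Counter


def decompose(seq, k=5):
    """Step 1: split a sequence into all overlapping oligopeptides of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def residue_frequencies(database):
    """Fraction of each amino acid over all sequences in the database."""
    counts = Counter("".join(database))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def expected_frequency(pep, freqs):
    """Assumed index: product of the frequencies of the residues in the oligopeptide."""
    value = 1.0
    for aa in pep:
        value *= freqs.get(aa, 0.0)
    return value


def rarest_shared(seq_a, seq_c, database, k=5):
    """Shared oligopeptides of A and C, rarest (lowest expected frequency) first."""
    freqs = residue_frequencies(database)
    shared = set(decompose(seq_a, k)) & set(decompose(seq_c, k))
    return sorted(shared, key=lambda pep: expected_frequency(pep, freqs))


# Hypothetical toy database in which L is common and C and W are rare.
toy_db = ["LLLLLLLL", "CWLL"]
print(rarest_shared("LLCW", "CWLL", toy_db, k=2))
```

An oligopeptide with a low expected frequency is less likely to be shared by chance, which is why the rarest shared oligopeptides are listed first.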
(57) Abstract: A method of predicting an interaction between proteins, characterized by comprising: (1) decomposing the amino acid
sequence of a protein A into oligopeptides of a certain length; (2) searching a protein database for a protein C having the above-described oligopeptides
or a protein D having oligopeptides homologous to the above-described oligopeptides; (3) performing
local alignment between the above-described protein A and the protein C or D thus detected; and (4) predicting whether the detected
protein C or D is a protein B that interacts with the protein A by using the results of the local alignment and a value calculated from the amino acid frequencies or oligopeptide
frequencies in the protein database; a recording medium carrying a program for the above prediction; a prediction apparatus carrying this recording medium; and proteins obtained thereby.
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(b) zLvm i^fesi^ottict y u^y?- ptzmmmmt it 

(d) ^iB^^tifcsew^-^^-^tr^-b^ If IB a- U =f ^ 7* 

?t®7*jmmnzn^tzm&mt>L<&#y^7>3-\! <c) & s x 

ttarE*y=l p ^7'^F07$yRE3»j2:ffieft^§^eE5ijsSro*: 

**880ioo»tttt x tufa (a) ~ (f ) (D^mzmnTzm 
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4 
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TIB (g) ~ (1) ©3^i&»£>»tt*i3£l>& < t %— o<D^m% 

(h) D-i7;i/77^^>FCJ:D, HufBA fufB&iB £ ft£ B tom 

(i) tuIBA L < &Mm&ft$tltzB<D&fcMm&Wi,!®(Dm'&, 
^^D^-^r 'J > ?T'iLi&m&^y' JltttttlZWj^. ^©Afr^jg^E 

tiiB©piT77^^>h^n§7^ y ^ # ie #j © a 3i & ^ -f s 

fry 

*mwcD i -d commit, mm®m&mfc&&wt%^WL*m7Ltzw.& 
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U K (B) "OV>T?^gfcfi&fc: AfcB fcOlSKlfflSffcjg 

M^ftfcgSffxtetf y K (B) tOSGSMSfl^^SI^?- 

^ & fg * ^ *• § <t ® % m m t z> is m x $> % 0 
*^bj© i o©jk&b: n #sgBj§£ «t o t# h titc ^nmm 2 <vt 

oTVTHiiBc 1-2 hOffil^SIIf^ii^ftS^ 1 ; =f^ 
7"^- K s X fci: £ ti 0 ;t y K<ZH"*T *l £ 0^79- K T? 

fc^TVTII^Bc 1-2 i:©ffl5#ffi*IBlE-r«*lll*^-rS3Hy ^ 

3=- K^^JitSx VTEJiBcl-2 fcOffiSfe/UfcHg-taftfifc^ 
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6 

t s & ® © m so -fi s T & s o 

*^BJ© 1 -o©^*£&, ]iyE©^«a#&;££fyfa©^«fl^S£ <fc D * 

t m # -r s qj titt # & -a m & ft © fit « * * £ l t & fr a . 
m m £ w -r § § £ w *@ s f£ m & m t % s a r © £ * t> * © s & ^ 
?£-?fc3o ^^.t. #ignj3© i o©s^ii> £©#&-?!# 

/ x & - -D <d m & m © m s # /§ * m &E -r s *b % * s &i ? % # s "e & 

#2gBJ§© 1 o©»S«fci\ t&f3©^$J#&X£iiuie©M^S£>EV>T> 

££>t^ ^mmo i commit, mmm^Moy^-^^ ypn^^ 
m t z % & -c » s> n s a k d -t ^> > v/x & @ ^ d -t ^ > 
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7 

■ 

mi <d ( a ) ti^Dsi 2 s*©7 $ >> *>© 2 o as§, ( b ) 

£ ( c) ^ *i ^ ti, M 2 0giSfr£>ft3;r$ y KiBJUSSfcT'D 

±xm^i^mt Lxfrmztixnb titer s y k^s®*: 6ffl©t u =f 

D ^7 A ± T IB #J tit $8 h bT#J§?£;ftTl§ £>*vfc:7^ ;iI5fl©*'J 

12^73- y y -3^73- f©7^ y ^iB^J^^-T^ t h ©S 

•.. 0 3 fcfc x ^Dll 2 h ©/? 7 K b ^ 1) >g«g#3r^--te* 2 ( AR 
K 2) LTV^sj- U 3^73- ^D-A;V77^f^> h t) 

0 4© (a) A£:fcfr3#Ti* ^iolg^, (b) 

Efc it ^ £ ?5 o 

0 7 «\ ^D||2 ( V T II ) kBcl-2 t&&mhX^%* »; rf 

08ii, ^Dt|2 (VTH) fcBcl-xLt^tl/T^S^iJ 

0 9 lis ^D«^2 (VTH) tMCL-U#ftf bt^StU^ 
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0 10 t£ s ^Dtll (VTI) Bcl-2 (a) XfcfcB c 1 - 

xL (b) J:iS^tlt^8*lJ 3^y^- FSo-A^77-f^ > h t 

0 1 18, ^D«H1 (VTI) i:Bc 1 - 2 ftffiSff^ R&t C(b) 
©£0X ^D||2 (VTII) ^Be l-2i±fflSf£#-fSC:i: C(a) 
(b) ©£0) §HepG21HSi;B 1 0 «MS& JB ^TM&fcj fc: 
^^bfe^^^^-Tit^^ijET^^o Its Bcl-2 IPs^yt 
VTII I P s li, ?n?htjiB c 1 - 2 mi&RVtfiV T HirSte^/g^ 
T^^tfel^bfe^ ££^1~o (a) CDifeE&iniB c 1 - 2ifc#-%JB 

^te7X^^>7ny Mfcti© (Bcl-2 WBX (a) 
•£ni V T II inl#£ $M^T!)i^^>7d y h (VTII WB) bfc&CDT 

(b) &B 1 OHHt^Dii 1 (VTI) (£0) 
D **2 ( V T II ) (£0) £f£/B£-£ N jffflj&ft©i:©7 7*£'3>3&»£> 

§ m ^ t its b tz %m & m r«5*» m T* tb % o 

m 1 2 & x ^Df|2 (VTII) tBcl-2t0D-A;i/77^^ 
@1 3B S Bcl-xLOiftil±-e^DSl2 ( V T E ) 

0 1 5 £ N ^Dg| 2 (VT H) £B c 1 - 2 bTV><S;fr U 

0 1 6 (i s t hA;w^-T|j|i®ieiCD 4 tE I V 1 »7 
illfilf P 1 2 0 iuP^btl^tfU 3*17 3- ^D-AJP77 
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9 

##m-rz*v Kifsto (b)y;CD4i:gp 1 2 ormmm 

El 1 7 B\ S4©ii%SSSfiiT^^ CED-4hCED-4-t 

01 8 «\ 7 F7l/^-t-7DT^f > (APP) tZti%® 

WitZ>BmT'%>% BASE t&&mLT^ 'J3>7f K * D— 

0 1 9 tt N 7 a.- y >7l/A-^-SfiH (furin-pre) 
7 * > tr;i/^^ > h @^7*l/ — SfiK (VWF-pr-e) 

^ t T ^ S * U zf ^ t*^- F ^ n - A ^/ 7 7 -f ^ > h «t D » fc em * 5^ 

(b) itmw&m-zmmtttmm&cDm^mi&vT ;wtmm*mTo 

" 02 0 B\ 7 Yzr^-b — V — ZTv^-f y (APP) t Z-fl(D7 

c failing 
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.1 8a«#e 

#b<gft^-r^ 0 # T © f¥ Iffl & R W tt ■£ >)> i£BJ§©fc«>©&©£ 

«£*lT^fcl^|Ei) % BJ © JR 5 t * ^ T 31#©*n fiH £ # 

t^ffcii) ^jlHii^^n?.ScJ*^^Oo # M -e & iii lit g& 
^©i^O^S^tl^nt V^£o ^© J; 5 &3lM£*lT^3^#J©;5■ 
&£gf^-r£ : PJ^T © g » * *i & £5i$i-3;r*:fc£t) 

S^-y zr^r^ £ 5 fa 6 © # *J £ # o T ^ £ fc#x. 
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Po £ S *t <Z> — & E 3BJ £ 3fe T (77^>h ; alignment) it 
#££*f\ 93-10 4 N 1 9 9 6 ) 0 £0IB#JT5 J* > Mr fci^D — 
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12 

fthZo ysij-^vtryu??* >¥(Dmm t*^V^fc Smith-W 
atermang (Smith, TF and Waterman, M 
S, J. Mol. Biol. 147, 195-197, 1981) Hct 

t> m&i<Dffl.fr&t>&(Dm&fc t •^v^TCDW^fflp^^#^ ^<ow.x •& 

<Z> lit! t£ £ fF ffi T t 5 o ifcgt bfcg^HF^T*©^^ Zti 

* n rr ic mi £ ^ it, ^#I3^J©0^fc>-fr©«}i4b£fT? £ hkiJ: Ds 
n—%)\,7^4 * > h ^fjt>tL^ Gotoh, 0., Patt 

ern matching of biological seque 
nces with limited storage, Comput. 
Appl. Biosci. 3, 17-20, 1987 ) c £ © n — Jj)V7 

5 9 o tltS^IIO^DtltB c i- 2<om\±, I§i^t-9 

T^uj f*7l/ij-it-Sai^7t>lf;i/77> h@? [Von W 
illebrand Factor (VWF)] 7 U % -V-mSM®? 

Ztl. Mjt±X &MM<D7 ^ ;mmw$ : &&Lx^2> t%%.2> z t&x 

m is © & s t ^ ^ v v x m tb £ ft tz m & w rs « s f£ ^ m is m t <t n « ^ 
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?S(7 (1) (4) S**fifttbT^5. 

^r<;7 (3) n * ti 6 S 6 JM T © gfl # M © .;}§ £ 

n-*^75^f/i/h.tii)fffiu (4) £fc^£;h, 

iv 7 ( 1 ) : 

£#£©*>;l^ff&L<y;*-y^:7^h- (A) CoHTs *©7>*>> 
0J*«Bltts ^i|0 157 :H70^D||2 ( V T II ) <DT ^ 
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14 



U ri ^-7*^- K © g £ # jl ^ 15 £:\ ?(D* V rf K©#^Si±. flffi 

( 2 ) : 

10 u©7ry7Tli7797 (1) l:#J§?bTl§£*ifc:3- Urf^^^ K 
(DT $ y ^I2?>J£&^ tzm&Mb b < U ^73- Y (C) £s X&£ 

7"^ h* (D) b < ^73- \*(D7^ J — 

is m^^'J rf^r^ Ft* <#St5i^*fe§ bs lioi^- 
& -5 0 

#I?Lfc^ E 2 fcfc. *B§® 0 1 5 7 : H 7 ©^dSSI 2 ( VT II) ©7 
^ ^ MM$$sfr b 1 31i^t:o^t^g^ ftfc 9 ii07^ 7 ^ 5 *@© 

20 -^-^SWISS-PROT version 3 5 ffttibfe 

09;t«x i202SgCSbfe^'j3^7'9 : FKC ILF^^o 
kF0l61li)3 7Fl/7-'J >S^^^^--fe* 2 [BE TA-ADRE 
25 NERGIC RECEPTOR KINASE 2(ARK2 HU 

man)] t-&6 ^ h^^m^m^ bx#e> ti&o 

. Xrf 7 (.3): 
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S3fe, <D#^2 (VTII) ^7Pl/t'J>fgMt 
— S2 (ARK2) h0D-*jV77^y>h©^§®itt^§ o 

X&A & b < b < fctD o£t3© 

D&A £;f§2#/g?-3^es$ b < fci: <; ^ 7" 3=- FBt^^i:©? SSli 

y ;AT-3- K$n5iS6Icfi-e©7^ C0 4© (a)} 

EtM$ti%ffi.m}t Ci4® (h)) ^©#^^0jf^{i 1 2 8 4. 

8 6x1 o- 10 hif-3f£ft> ^JH$#^Uo -XmmV J A* 

LLLLLT&tK ■fcOftgttO&SSfci: 1 3 6 3 4 4 . 3 4 xl0~ 10 
T&£o — #4$^t$#;^7>$ 7$SB?tlS# 5||©^.>J zli7f- 

h-ftCCCCCffef), ^©^M^©Ji^f± 2 . 2 0 8 xlO" 10 T'fe 
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tifc. VTHi:Bcl-2i:0ffiI^}i, Bcl-2^II^tfel 

Sl^&fci;, suffix ft £ %©-e&& < N lifcfefiB-e 

rft«> J: a ta* V dr^r? yrttrRoftRomfflfflt 

£B c 1-2 <Z)ffi2f£/B&&-?!fc£ t¥m$tifc*V FNWG 
RIfcfcx VTntiSIMSjfliSlU ^«%£!i£bT*!lffi-£t5o 
-5 gigs -£lsEII& if £ b T*Jtf # qTe&T -5 • 

B5 £Mf&ffl%^Ltz£v fc s 'J> ft < £ b SIT (D (a) ifi £ (f) 
ATJ^© (a) : 
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^-Urf^T*^ K^j^/fBft^® (b): 

IBIS^^ ( c ) : 

t^^XIBii^a (d) : 
IBtl^g (c) fcfBti b < fcfcstf U ^7*^ FCHt^T 

Sfif^Kli^r^ F (C) XtttfrIB;*- U 3^7? Y<DT K J M 

mwtftm-&7$ ; mum ■? a m * b < y ^ h- ( d ) 

D-*;i,:P5>f * > MS /e«^'« (e) : 

Mmmn/mm^m^m. (f): 

^ff ^*<D&mfr §&®^sfi^©^©fc#>©j£;fs£if#-r& 
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M&ft-fr/f3«^#g (g) : 

(h) : 

:£ft*l£frir/fi3fi^3^ ( i ) : 

£&3K#S/SBtt#g ( j ) : 

MfrXfi^m (k) : 
f3it#g ( 1 ) : 
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A±^g (a) £<fc t) A.* £ ft *| |fc 5fc *D © 



6Rt> l< ta^y^r*- p (a) a % *y r^r^ k#i?/e*# 

^Pfc#fl?£ft, K^y 3173- KliEttSft*. rofct, MA©a 

tj^ mmi££*) s w&mx.\±#vi7^ KfcWLTA7j£ftfc5=-;7 

^-*&I3t«7rs^© ( C ) g#A:fc^s (k) tit), £&a 

IS 12 ft £ ft ;t y zi^t"^ K©7$yii5!|fc , 3V^x ^ 

(c) fc*f LT&m/iati^© (d) ti!)**iJft$n, **y=r 
M^s/gBit^g (j) tzxt), ^ y 17 



(e) fci t) D-i;V7^ > h#&£ft, *©£Jil#EflS<*ftSc 

£&im/E®^*^g (f) t^t), (c)'cKtt£ftfc 

(Demote ®(DmM&m-nz tir fzmztiZo * it, 

aK*b<J±#U^r^F (B) £*§s;r £#-?iF3 0 *fc x tuIBB 



WO 01/67299 PCT/JP01/01846 

20 . 

£\ tf&^tis c: t &T£3o &m£ftfcB#a^&&BS£&. JUS 

tttt/iati3l^^g (g) CiD, ^ft £> B ©F^T's liufBA £*@Sf£ffl 
ti^tS^^tittS. :fc#:*3SfHr/fBtt£a^®! (i) 

io nsaj#&fc fc#-e£s 0 ztibo&m (a) ~ (i) jy. 

ffl^m (n). fil^7^^ (o) £££fif;ilfc&©-e&^Tfc .fc^o 
15 I6lXli^'M7'5 L K (A) J:IfiSXIi^'J^7"^h' (B) tvm 

flic mmm&mxittf u k (a) tiaixtt^J^^K 

SAj^B © ffl 5 f£ /B « ft £ ft 3 cfc 5 tulB^^fc: i D 
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1-2 t&&mTzmm^(Dm?m^ i tise©^- y 3^73- fnwg 

^WFtfeoT, VTD^Bcl-2 hcD^gSf^ffl^PS-r 

U rf^T*^ K&jRIM It V T HtBcl-2 £ © 

&> i^cieixii^ y^T*^ h* (a) tseixii* y ^t-^ h* (B) 

tufa ^^©seiXJi^'MT'^h' (A) J:Sfi 
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m 6 »©gaz* ^ #1 1 h -r -5 m&m(D 

^ i: t «fc *k &£fcH^rs3ms@i«:i:fl s ffi&^"f 3§6K©fc2^'£■ 
t 
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i - U > (fur in) £©;|SS#ffl 3 £ t&XZtzo "T^t) 

^tltSWISS-PROT version 35£ffl^fcfr\ 
1 ) 

m 1 tt. X?^7 ( 1 ) ©Hifcfclfc UTx ^110 15 7 : H 7 
D#fjt 2 ( V T H) 07 ^ J t>mM<0 2 CE 1 © (a)} 

© (b)D T*&3o 01® (c) ^D$H2 (VTO) ©7^ 

t>m%](D 2 0B^#£o^T7' ^ >> KE8J:E# 6 ffi<D* V 3i 
7"^ h*fc:#J8IL;fc &©T£3o 
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(MMffl 2 ) 

mmm ^wrsmMff^^m^moT,^ v 7 (4) tz&^z, 

V> £ o £ C t bTs El 4 jj* & ill T V» 

0V AT* h~£ft£:£g£Sf*T- Offing C0 4© (a)} fr&x * 
^S^&a iOM^Jt^A i&E4(D (b) ©idt&So 

»J 3^7?- F#a 1 a2a3a4a5T-£ 
S £ V 3^73 h'C^fifiliA 1 x A 2 xA 3 xA4 x A 5 

o ^- U 3*^7"^ K^KCILFOiils 4. 40 

6610x1. 170608x6. 004305x10. 639652 
x 3 . 898962xl0- 10 hif-J|£ftSo tfc, PL 
L L L L 0<g^t£& 136344. 3 4 x 1 0" 10 , t'J D173- h* C 
C C C 2 . 20xl0- 10 i:tt|E$*l5o 

KCCCCCTa&t). ^ U 3173- K &*M®£V Afn-F^ti 

F'LLLLLtfeSo 
7.3- v 7 ( 1 ) seMJKte*" h* (A) U 3 173~ K 

(011603 3 ) 
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Z ZX'teZOmMffl t IT, Gotoh©5S (Go 
t o h , 0. , Pattern matching of b i o 1 o g 
ical sequences with limited stor 
age, Cornput. Appl. Biosci. 3, 17-20, 1 
9 8 7 ) fc J; ZUfrmnCDT? 4 * > F0^37 (score) 

leixii^'M/f f (a) &nx.\±xv h* (B) 2: 

ISlXfi*' l J^7'? K (A) hl&lXli^Hrf (B) £© 

Si (l^i^m). B07^ ;&W,&}(DM£%LBb-§-Z>o £ © n — % 
^^7^> h Ol&mfr *> ff#£*i£ Ai:B £©FeO©;fgS#ffl©^3!]© 
^i^sum (S i) /LB bfcmtZo Z. © fl SI ft IS I £\ *iSfE 

(VTni:Bcl-2i:0ffiS^ffl) 

*B§®0 15 7 : H 7©^PS*2 (VT II) ft t h iz^^m^WB 
mg*&ZT&, ^©^ifii^<^fPot^^^ (S and v i g, 
K., et a 1 ., Exp. Med. Biol. 412, 225-23 
2, 1997 ;Paton, JC, and Paton, AW., CI 
in. Microbiol. Rev. 1 1, 450-479, 1998)o 

-^SWISS-PROT version 3 5^^ «JH»14 
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2<DT^J ^IB^* 5 m®* U rJ KTS 
bfc t h0§BjKfc##£*lS(DfciU 0 6 fc*®|fe8tM:E#jF£;h,TV> 
^i^i:, SWI SS-PEOT version 3 5 t^VMLC 
LLLs QRVAA, EFSGN, NWGR I 0 4ot$ (06if?: 
litF©SailiSWISS-PROT version 350S6 

£tiT V^ 5^11^ y ATn - K £ *i 3 £g 6 K «t> T* © T ^ S m.M&i> 
bstntZt, LCLLLO#Ittiil 5 0 0 1. 0 3 x 1 0 ~ 1 <\ £ 
Q R V A A N EFSGN, NWGR I t^VMli, ^tl^tl#^^i± 
15584. 5 5 xlO _1 V 3801. 6 5 xl0''\ 1479. 

t'Jn^/^KNWGRIIi^D mm 2£Bcl-2sBcl-xL, 
2 ( V T II) tBcl-2, Bcl-xLx MCL-lfD- tiJVT^ 
SP^lftfcT:; y ^ IE #1 ± t? ffi £ & "2 o ^iitx H3£#l 3© «fc o 

Bcl-2 (3 0. 0 + 2 7. 0 + 2 5. 0)/2 3 9 = 0. 3 43 
Bcl-xL (3 0. 0 + 2 9. 0 + 2 7. 0)/2 3 3 = 0. 3 6 9 
MCL-1 (3 4. 0 + 3 0. 0 + 2 8. 0 + 2 6. 0)/3 5 0 = 0. 3 3 7 

£*l£> 3~D<Dm&M(Do B c 1- 2 tB c l-xLtt@C77 ^ 
'J-li^UT^r^0 , r^ B c 1 - 2 < &U : B c 1-xLiMCL - 1 t 
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c 1 -x L ^&£ 0 
*B§MO 1 5 7 : H 7 D #5f| &^ n 2 ©ffet^nSJl 1 (V 

5 TI) i?7j V7t-Ai: lt#4t5o ^Dfllciftli^nil 

2 fcit^T^ <, ^D|12® 5 Ofrff) 1 IS^T'&S (Tesh, VL., 
et a 1 . , 1993, Infect. Immun. 61, 3392- 

3 4 0 2 )oI6Ir-^-7 SWISS-PROT version 
3 5 fc^^Ts ^D|| U7^ J if 5 M©* U 3^7? 
2>MMyZfcm& Ltc t h©g0IliP2Xl_HUMANt.fe!)s © 

7 2- K S S T LG<DWnm±, @4-e^^tifc^II^yA?3- K£ 

n^^geM^TcDT ^ j mm&frzmnt z ts 14385. 63x 

10- lo f$.t), NWGR I t tk^mm&&ffi 1 Oteffi^. 

3^73- h* NWGR I fc^jC-rS* U 3i7? ^NWGEL-CS&S^ 
£ft©#Mt£fct|g 4 £> 2 6 2 2 . 3 0 x 1 O" 10 !-^ t), NWGRI 

tjt^ts^o b c l - 2Mb c i-xl*7^ ;mmm&4m<D* 

V 3^73- H NWGR§^D«| 1 t&mTZ&. NWGRIiiNWG 
RL©#g#e> £> N Bcl-2Sl5Bcl - xLli^nll 1 «fc 

1 £B c 1- 2XiiB c 1 - xL h®U )\,7 ? J * > h©*£JH 
1 0 ) ip bmMZtlZ&M*) ^il^fi ( 2 7 . 0 + 2 6. 0)/239 
= 0. 2 2 2 , .2 6. 0/2 3 3 = 0. 1 1 2 (NWG IL fcffl 
l^ft^ * y &«B#IB8Iti&VO £&!) N ^DiiUBcl-2XliB 
c 1 - x L £ © ffi 5 f£ ffi B\ ^D|| 2i:Bc l-2XiiBc 1-xL 
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(mmm 5 ) 

(VTniBci-2 tom^m^mommmmm) 

^D#^2 £B c 1 - 2 tfHlifciffiSfE/B-r-S £ ££H^lfl£ii^bfc 
CE 1 1® (a) RVmi 1© (b) ^E^o b h 0ffFflt#> 

M (#*B c 1 - 2*£ : ?£$gJBLTt , »&lO HepG2. J&UZ(D 
fflmiZB c 1 - 2 M^^^-^IAIB c 1 - 2 £3§3Pit-3 «fc ? £ b 

£«b i o iz-D^Xs ^umm 2 ( v t n) -frwrn kbci- 

2tn;^(Bcl-2 I P s ) RVtfiVT nffi# ( V T II IPs) S 

mi 1 © (a) &m&&$Lmtkfe%i£ifiB c 1 - 2 jfttt&ffl^T^o: 
**>:TDy h##f Lfci£lil> 01 1© (a) ^mtZfriV T IUrbtt^JB 

fl§£:fc^TVTn£:Bcl-2 fcO^-pft^^^^tfe^tt^S CI 

tz, b i oi«t^D#jt i (vt i ) x\±^vmm2 (VT H) 

nii 2 (vt in & ^ h3>pj777^^3>*f, &#t£ts£*i3 cm 

110 (b) ^EDo 

ft^i: fcfr&ft&fcJfcfiEW&aifco 01 I© (b) £0t^0iS£3l£^ 

to 
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(mmm 6 ) 

^ D *i2 (vtii) c i - 2 ot $ ;wtmw<D±Mzm7rsL, 

1 2 fc^t 0 
(^Jfc#17) 

B c 1 - x L ©&#*ijt £ N g SR^ftHjgx-^ 
PDBtf0iiisti$n t ^§ o £ ©1fiii£^£B c 1 -x L(D± 

(mmm 8 ) 

ft*!^-^-;* PDB tg^^tLT* X> . i©iii^Sl:^Dt| 
20*tD^-tT'J >^£frl^ ^y"JViL^mit±-V, Bcl-xL 

^ffii^^T^ s MUfi-m&KDmmzm 8 ©d -*;i/77 ^ > h ©^ 
•(fifths 9 ) 

(vt JnzkzmmfzmmvNWGR 1 t&z&im) 

&fc£Mffl4 Jl ffi £ ti N VTIItB c 1 - 2 fc#it£LT V> 3 # 
i; ^^7-^ FT^^NWGE I # N VTHiiBcl-2 £©*§S#/B& 

©^-£ftB$£, NWGR I >j rf^^*^- f ? ©#^TtrfrV>> 
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«^>7DyhS (Far-Western blotting a 
nalysis) fc: <fc bMVfLtzo NWGR I ir U h*&. 
&#f$tVT II^B c 1-2 £©«1§r#:B$£II«Lfco It, B 1 0 
$BJ&£NWG R I & V rf^T*^ K (K 10. 5 0 N 1 0 O/iMtill 
U 1 Ong/mlOVTH^t.!: 2 >*fc 
£ £ *M % ^ £ «!l ;t bfco iirif^5 0 0 0|®|^Hoechst3 
3 3 4 2/PI (Pr opidium iodide) f^l!)^ 
U ^ © & 1? :p >X£jg;i LfcM© J±*£0 1 5t^bf;o ® 

^^ti^c^ vt n^^ommx-m 8 5%©;m#7> 

£££ NWGEU'J 3^73- K©ag&#lfltiM%©^#& 
$J£*ifco VTEkBc 1-2 fctf^tltl^tUd 

^^^feiNWGRIH;, VTIItB c 1-2 2: ©*S2#ffi fcJUftl 

\ 

10) 

(CD 4/g P12 0HIV1) 

t hx-fX^-r H I V 1 ttA^-TMti^t^o £ © 2: § 
^^J^gfOSSlg p 1 2 0 TUS^tSSSlCD 4 

£ £ ©3M^©JSI£©!I l|gpg£: ItiStfe^ itijs 

#froTV^3o #Hi£#lT-y:, gpl20iCD4 © iSg # fu IE ^ # 

^-Ur3^7-^ Ft^ib, MCD4fi*©^'j3^7'f K©T^y^ 

SB^J^^oSeS^SfiM^-^^-^ 'f-rtt^-r^ is SLWDQi: 
H V 3^73- Y %&%i~Z> W&Mt. Itgpl20 ^ilB^iiS 
1 6 © ( a Do a- U rf^T-^ KSLWD Qttx SWISS-PRO 
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T version 3 S^t-t hOIfilhltliffiSCD 4tL 

3 1 6 © (b )} 0 CD4i^^gpl20 tm&TZ&fc^ Z.(D* 
V rf^T*^ h* S L WD QOt<- N$im<Dm<D7 ^ J m^£7 ft *r - > 
(A r g) Rn 6 7-SFLTK GP — 7 3 #Sgfc«S!l SJIIfc 
Si h-ttftft^X V> 3 (Kwong, PD. et al., Nature, 
vol. 398, 6 48-659, 1 9 9 8), Sfc,gpl20*|*6 
CD4tg«t«fctx El 6 © (b) bfcffiEttOiS^ISi* (2 
89-KTI I VQLNET VK INC I EPNNKT — 3 1 0) ©f 

•efe 5 £ biP-frip oT^5 (Kwong, PD. et al., Natu 
re, vol. 398, 648-659, 1 9 9 8 ). Ufc#oTx gp 
1 2 0 CD 4 ©S-£#**IT2& ofe i: bt * liu IB ^ $1 # & T' ^ SI 3 

(SSJtefci li) 

(C E D - 4/MA C - 1 ) 

^4 (Caenorhabditis e legans) (C. ele 
g a n s ) y £4ft##HJ3 frfc£:ti;fc*l!J©#IBIfl&£$'Cfc3o 

£©C. e 1 e gans t-p^T©Ili^-^uuf^to IfilC 
E D - 4 tt7"D ^ A ilffl US ?E © f&HSU # >h80te®m*mrzto MA C - 

iftCED-4t^i, «?E£W;L3ge«£ it^^^n/c (w 

u et al., Development, vol. 126, 9, 20 
2 1-203 1, 1 9 9 9 ) 0 Lfc#ot\ MAC-lhCED-4 ©f£ 

u 3^7?- }*<D&m&%mitco ^(D^m. mac- i t ced-4& 
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5f©t V p F P S VE 3 Z. t ifimW V, #£§BJ 

ifitkU^flito C. elegans^;A*©7UI©it^ 

5> If- £5**1, * £ ©tf- U dtsiy? K©Jf#B: 5 . 4 3 6 T-&3, 0 

5 Ji&Of 6l0^fit^<^tt5*©Tfe?. (±t)©IE#J# C E D 

(HM 12) 
(APP/BASE) 
10 APP (amyloid precursor protein:!/ 7 

7-MBtfi;5 7^ D-T £©" * J?r © §r ffi m © 3 *>s 

TS. J 3i*%ffl&-&]mTZ>Bm (BASE)(bata secretas 

* 

15 e) ^IS^B^tifc (VASSAR et al., Science, 
286 (5440), 735-741, 1 9 9 9 ) c BASE£«fc3AP 

TQwrnit, ens, -ooieiriT-ii^^^^. c £ £j*j#-r3o 

*ISB§©tftffiE©fe.«)fes -O0ISIA PPtBASEto.^ 
T^-Urf^r^ K©##£&W^fco APP tBASEIiffi^fi©*^ 
20 7^i5fiOt>J^^^h'WYFD VJ:WYYEV§^bTl^o 
# y 3^73- F W Y F D Vfc\ SWISS-PEOT version 
35*ttF ©Mfig^ bTMeif AP P Cimtfs SfeWY 

25 -^©gfiR HT?ffl^&aJ^1SJi*BI^-r* (±flfl©E5!I#AP P x T 
MOM^WB A S E) 0 



I 
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(mmm 13) 

(VWF) N 7^7"^ >x1i#:C 3 ft £©#$^©j^S§( P a t hwa y : 



T- fc* 01*600 t lT7a-'J > i: V W 



F ^ © ffi 5 f£ JB © #1 £ iMf 3o VWFIi7a-'J <fc »3 7"l/i7--y-- 

WF©7l/^--y-MfiK©-®»rtii, £ tifc -o©SfiMP^T-ffiS^ 

ffi^'iJlT^^, ^:7a-'J>@|§7a-iJ 7071/*-1f-Ifi 

io K#«J#r£*LTi&l?km&KJ:& tJ 7Dr7-€ J: bT^t^T^ct 5 fcft 

3o ^ £T*. #%0J3©&iE©fc#>£ N 7a-iJ >71/A-U--IfilJ: 
VWF 7UA-^-Ig|0^ U rf^T*^ K©^^S^K^feo — -D(D 
M^Kii^-U =3^^-^ FHCPPG^ 7a-'J >71/*-t-I61 
T&6 13-6 1 7 ©&Sfc. VWF71/*-1f-ISStli 117 6 
15 - 1 1 8 0 Ottt^ot^S C!3 1 9 © ( a £:* *> £> ©{ittfcf&f!: 
Sfi^©^*^ t 3 0 t'J J^7f h'HCPP GttSWI S S — P R 
OT version 3 5tt h©gfili:LTii7j--'J>7l/A 






!) > <t t) 7 5 7iHS0&It- 7 6 3^76 4 © #12J#r £ ft 3 o 



- >7*l/A-t-iaiJ:VWF7l/A-t-SflIJ:0F^tD 
— A^T'^^^^hSfir^iix 7a-iJ>tJ;5VWF7I/A-^- 



<oi$^ : $frm.M'&$) % z t&frirz cmi 9© (b)) 0 -r^^^s 

ffi^0g&f, £ ££&ffi^©S&M*T*©1iO#r{iifi© 
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* 

^Sfl * ft o £ £ ifi rJ 16 & % 0 

* 

(Example 14)
(APP/PC7)

5 APP^?7lis D-T K /l/ij-t-yDH y (amy 1 o 
id precursor protein) (APP)tfT^D^ f*§ 

*>*o APP^l/^y&^fcSfc&o-SJBrfcp C 7 (proprote 
in convertase subtilisin/kexin t 
10 ype 7) #H#LT^& £ £&M%:%M,$ titz (L o p e z -P e 
rez E et a 1 J. Neurochem., vol.73, 
5, 2056-2062, 1 9 9 9 ). —o©geHAP PtP 

C 7 fco^T*- y KO&^ttfcBS'** APPtPC7liffi 

SttOB^fi^f 6 rT^r^ F'DSDPSGJ:DSDPNG$it 

15 f lTV^feot'J^^7 , 5 1 KDSDPSGIi,SWI SS-PROT v 
ersion 3 5 tffk h®f fili: ItlilfilAP P 1 
■fr-Ts lf;DSDPNG^Oth0ieili*Igt$!) N 

20 02O0HAPPC 6 8 7- KLVFFAEDVGS-6 9 70K 

J:L0F^APP 7;V7 r%&KZtztb<DWmwm~(:$> t>> ^nt^im 

(3 5 9— RMPEYAEECAS — 3 69) ^PC7fc# 

»rfc:§l#?- 3g£K£ s * SB 83 fc ft *8 "T 3 £ i: # qTfgTf £ * £ t: £ 

25 SbTV^o 
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tt#a±TO^«!|dSqrtgt ^ ^ fc®, two-hybr i d&<D£-5& 

ffiMti^t a*- u 3^7?- y commit mmiznfe-z 
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*'U^7-^K (B) Ifit^Stfe^t, 

l < smmmcDzr-p^-zftin&mLs 

( 3 ) tufa a h^m^tifc cx&d tomx-T ^ j mmm^-^^xcDu 

■r 

(4) d-£;i,:p^^ > l-oMs MgfiI*b<li*' , M7 , ^h- 

■ 

t/vOmD^s Atffisf^ffl-r^geM^ l < j±^u^r^ f (b) -?? 
3. ^0iai^,t<ii^>j^7^K (a) tmzftMtzm&mb l < 

T$>■^T^ 4>&<£fc&T© (a) fr£> (f) ©3M££#>L;fc£ 
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(f) fafBD — *^77^f^ > h©i&9lflU 22:0^6^7*— 
4. tt#©ttffl£3J®fcEtt©#g!fc;&n^ 4>£< fcfcJmT©^@£if 

7*^- K©ms©^^^^tf-#^n^m«ic^^t, Ktfefa^iifeas 

(i) HuiBA^) b<tt|ufB^m^^^B©^#:#ji* s ^©m^ Xtt 
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i£ ® B © IS 7 7 > h£*L3T$ >>8&#IB#l©4fljg£^-f 

< t & U T © § fe c * 4$fK t Z> J3SI# ; 
(k) ISKt- ^-^©^SfiK^tiufBAi: 

(1) VJ Ar-^^SBittsm 
1 0 . S3c©^HH3Jlfr£>|g9^©l>-f*Lfr H®fclB«©«BS*l##ffi*S 

7 s *- K (A) t«Ifffflt5J:W^ieiXli^'J^7'9 L h* (B) 

1 3. ff3c©fEffl|jtl lSXIill 2®£fB«©> *S:S#/9£3f 

1 4. af£©i£B0 1^X1*8 2iSfcfB«©^SI2r&&*tIJaL;fcN #>£©S 
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SJCXtt^U^r^ H (A) hg^SXii^U^T 0 ^ K (B) tolfi 

u^t-^k (a) tiaixii*" y ^.7°^ k (b) £<Dm&wmmizft 

4b£-«&©ffif*£*fc; y ^if>f >ltf f,*i^s ^©i&IXii 
*-y^7"^p (A) hiaiXH:*"»J^F (B) fcOSSKiaffiS 

1 8 . -<umm2 (vth) tBci-2 fc©ffi:sf£jfl&Bsg-r3«tE£# 
1 9 .i2?m©iE#i#-t 1 ti^-r^ smmmzm-rz* 1 ) 3^7? 

T^JWflsm&mtz**) 3^73- yh&^x, vtii^b c 1 - 2 t 
2 1 .itoiaif 1 8^x&iji2 o^t:i3*©^-y 317? y<dt^ ;m 

IBJOS^tr^y ^7"^- ot\ VTDJ:Bc 1-2 fc©ffiS^^ 

2 2.ft0^HIl 8^& L<£SS2 0^fcf3ft©^-y rf^r^KXtf/ 
Xttft;£©$Sffl^l 2 l3Sfci3<S©^y F§jRIJ§-r"*> VTIIkB 
c 1 - 2 t ©TOMSilt^IilttSft^Cl^S. 

2 3 . m>£<Dmmm 1 ® $ u < \±m 2m^m(D^m^m^itm^.<Dmmm 
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2 5 . K^o*gffl^2 4 mammon fr-sb^mfrz, mm^m-^t^m 



2 5 m lummox &T-m&.$ ntz. 



2 6 . If sfc© 



2 7 . tf;£© 



2 6^fcfBfS©&^fc>^a i £§£ft£^&^£-£> 



2 8 . 

2 9 . it^© 



2 7^tf3«©^^lr#e>nfc^i^t)o 



3 0 . 



2 9 3©£fam©#&T-#£ftfcseg7-n-k^>^# 
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(a) 

MKCILFKWVLCLLLGFSSVS 



(b)



(c) 



MKCIL 
KCILF 

CILFK 
ILFKW 



LFKWV 
FKWVL 
KWVLC 

WVLCL 
VLCLL 
LCLLL 



CLLLG 
LLLGF 
LLGFS 
LGFSS 
GFSSV 
FSSVS 



MKCILF 
KCILFK 
CILFKW 

ILFKWV 
LFKWVL 
FKWVLC 
KWVLCL 
WVLCLL 
VLCLLL 
LCLLLG 
CLLLGF 
LLLGFS 
LLGFSS

LGFSSV 
GFSSVS 
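
The decomposition shown in panels (b) and (c) can be reproduced with a sliding window that advances one residue at a time. The following is a minimal sketch, not the patent's own implementation; the function name `oligopeptides` is chosen here for illustration only.

```python
def oligopeptides(sequence: str, k: int) -> list[str]:
    """Return every length-k window of `sequence`, shifted one residue at a time."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n yields n - k + 1 overlapping windows.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    seq = "MKCILFKWVLCLLLGFSSVS"  # sequence shown in panel (a)
    print(oligopeptides(seq, 5))  # pentapeptides, as in panel (b)
    print(oligopeptides(seq, 6))  # hexapeptides, as in panel (c)
```

For the 20-residue sequence in panel (a), this gives 16 pentapeptides and 15 hexapeptides, matching the number of entries listed in the figure.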
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m 2 

MKCIL 
fib 

KCILF 

BETA-ADRENERGIC RECEPTOR KINASE 2 (ARK2_HUMAN) 
CILFK 

WHITE PROTEIN HOMOLOG (WHIT_HUMAN)

ILFKW 
fib 

LFKWV 
fib 

FKWVL 
fib 

KWVLC 
fib 

WVLCL 
VLCLL 

GROWTH ARREST AND DNA-DAMAGE-INDUCIBLE PROTEIN GADD45 (GA45_HUMAN)
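
The lookup illustrated in this figure, in which each oligopeptide is searched against a protein database and the entries containing it are reported, can be sketched as a plain substring search. This is only an illustration under stated assumptions: `find_hits` is a hypothetical helper, and `TOY_DB` holds made-up sequences, not real database entries.

```python
def find_hits(peptides: list[str], database: dict[str, str]) -> dict[str, list[str]]:
    """Map each peptide to the names of database entries whose sequence contains it."""
    return {
        peptide: [name for name, seq in database.items() if peptide in seq]
        for peptide in peptides
    }


# Hypothetical toy database (invented sequences, for illustration only).
TOY_DB = {
    "TOY_A": "AAKCILFGG",
    "TOY_B": "GGCILFKAA",
}

if __name__ == "__main__":
    for peptide, names in find_hits(["MKCIL", "KCILF", "CILFK"], TOY_DB).items():
        print(peptide, names if names else "no hit")
```

A production search over a full database would more likely use an index of oligopeptides than a linear scan; the linear scan is kept here because it is the simplest correct form.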
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>Score = 32.0 
ARK2 618 KCILFR 623

******

VTII 2 KCILFK 7

>Score = 27.0

ARK2 651 FKEAQRLLRRA 661
VTII 193 FRQIQREFRQA 203

>Score = 26.0

ARK2 299 TEIILGLEHV 308
VTII 44 TEISTPLEHI 53

>Score = 26.0

ARK2 379 CMLFK 383

* ***

VTII 3 CILFK 7



>Score = 25.0

ARK2 12 SYLMAMEKSKATPAARASKRI 32 

#* * * _ *** * 

VTII 160 SYLALMEFSGNTMTRDASRAV 180 
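
The asterisk lines between aligned segments mark positions where the two residues are identical. A minimal sketch of generating such a line for two ungapped segments of equal length follows. The helper `match_line` is hypothetical, and it marks identities only; the other symbols that appear in the figures (such as `_` and `=`) are not reproduced because their definition is not recoverable from this text.

```python
def match_line(a: str, b: str, mark: str = "*") -> str:
    """Return a line with `mark` where a and b share a residue, and a space elsewhere."""
    if len(a) != len(b):
        raise ValueError("segments must have the same length (ungapped alignment)")
    return "".join(mark if x == y else " " for x, y in zip(a, b))


if __name__ == "__main__":
    top, bottom = "CMLFK", "CILFK"  # ARK2 379-383 and VTII 3-7 from the figure
    print(top)
    print(match_line(top, bottom))
    print(bottom)
```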
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4 



Amino acid   (a) Count   (b) Composition
L            144615      10.639652%
A            128924       9.485230%
G            100162       7.369144%
V             95941       7.058596%
I             81611       6.004305%
S             79179       5.825378%
E             78102       5.746140%
R             75293       5.539476%
T             73489       5.406752%
D             69829       5.137477%
Q             60214       4.430080%
P             60182       4.427726%
K             59895       4.406610%
N             53727       3.952817%
F             52995       3.898962%
M             38745       2.850557%
Y             38721       2.848791%
H             30912       2.274266%
W             20761       1.527434%
C             15911       1.170608%
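
The percentages in panel (b) follow from the counts in panel (a) by dividing each count by the total number of residues. A short sketch that reproduces this arithmetic from the listed counts is given below; the names `COUNTS` and `composition` are illustrative and not taken from the original.

```python
# Residue counts as listed in panel (a).
COUNTS = {
    "L": 144615, "A": 128924, "G": 100162, "V": 95941, "I": 81611,
    "S": 79179, "E": 78102, "R": 75293, "T": 73489, "D": 69829,
    "Q": 60214, "P": 60182, "K": 59895, "N": 53727, "F": 52995,
    "M": 38745, "Y": 38721, "H": 30912, "W": 20761, "C": 15911,
}


def composition(counts: dict[str, int]) -> dict[str, float]:
    """Return each residue's share of the total, in percent."""
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


if __name__ == "__main__":
    for aa, pct in composition(COUNTS).items():
        print(f"{aa}: {pct:.6f}%")
```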
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LCLLL 
QRVAA 
EFSGN 
NWGRI 



FASA_HUMAN 
TRAI_HUMAN
DAPOUMAN

BCL2(BC2A)_HUMAN, BCLX_HUMAN, MCL1_HUMAN



>Score = 30.0 
Bcl-2 143 NWGRI 147

*****

VTII 223 NWGRI 227

>Score = 27.0

Bcl-2 129 RFATVVEELFR 139
VTII 182 RFVTVTAEALR 192

>Score = 25.0

Bcl-2 156 VMCVESVNREMSPLVDNIA 174
VTII 36 VSSLNSIRTEISTPLEHIS 54
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a 8 



>Score = 30.0

Bcl-xL 136 NWGRI 140
VTII 223 NWGRI 227

>Score = 29.0

Bcl-xL 101 YRRAFSDLTSQLHITPG 117

= ****

VTII 200 FRQALSETAPVYTMTPG 216

>Score = 27.0
Bcl-xL 5 NRELVVDF 12
VTII 22 SREFTIDF 29
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>Score = 34.0 

Mcl-1 326 GGIRNVLLAFAGVAGVGAG 344
VTII 225 GRISNVLPEYRGEDGVRVG 243

>Score = 30.0
Mcl-1 260 NWGRI 264

*****

VTII 223 NWGRI 227

>Score = 28.0

Mcl-1 275 AKHLKTTNQES 285

VTII 268 ARSVRAVNEES 278

>Score = 26.0

Mcl-1 134 LGKRPAVLPLLELVGESGNNTSTDGS 159
VTII 152 ISRHSLVSSYLALMEFSGNTMTRDAS 177
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3 1 0 



(a) 



>Score = 27.0

Bcl-2 129 RFATVVEELFR 139

** ** * _*

VTI 182 RFVTVTAEALR 192

>Score = 26.0

Bcl-2 143 NWGRI 147

****_

VTI 224 NWGRL 228

(b)

>Score = 26.0

Bcl-xL 136 NWGRI 140

VTI 224 NWGRL 228
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012 




126 RFATVVEELFR 136

* 
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0 13 
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11 6 

(a) 

>Score = 29.0 

CD4 85 SLWDQ 89 

***** 

GP120 115 SLWDQ 119

(b) 

>Score = 26.0 

CD4 27 KVVLGKKGDTVELTCTASQKKS 48

* *_ — ##* = * ~ * — 

GP120 289 KTIIVQLNETVKINCIRPNNKT 310 
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>Score = 30.0 

CED-4 158 HGRAGSGKSVIA 169 

** * **__ * 

MAC-1 245 HGPPGCGKTMFA 256 

>Score = 29.0 

CED-4 155 LFLHGRAGSGKSVIA 169 

MAC-1 571 ILLCGPPGCGKTLLA 585 

>Score =29.0 

CED-4 217 LLNFPSVE 224 

_ ****** 

MAC-1 698 FVDFPSVE 705 

>Score =27.0 

CED-4 93 FAINEPDLL 101 

* = * **** 

MAC-1 597 FSVKGPELL 605

>Score = 26.0 

CED-4 68 LGPLIDFFNYNNQSHLADFLEDYIDFAINEPDLL 101 

** *** _ ** * _ *_* 

MAC-1 723 LGEDIDFHEIAQLPELAGFTGADLAVFIHEL5LL 756 

>Score = 26.0 

CED-4 486 EIGNNNVSVPERHIPSHFQKFRRS 509 

*= * * = ** * _ *** * * 

MAC-1 603 ELLNMYVGESERAVRTVFQRARDS 626

>Score = 25.0

CED-4 301 FLEAYGMPM 309

*** _ * * 

MAC-1 215 FLEVCRLAM 223 

* 

>Score = 25.0 

CED-4 428 DEVADRLKRLSK 439

MAC-1 381 DAVDGRLRRTGR 392 
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118 

>Score = 34.0 

APP 253 EVEEEAEEP 261 

* *** *** 

BASE 46 ETDEEPEEP 54 

>Score = 27.0 

APP 307 WYFDV 311 

***** 

BASE 258 WYYEV 262 

>Score = 26.0 

APP 50 GKWDSD 55 

**** * 

BASE 135 GKWEGE 140 

FIG. 19

(a) 

>Score = 34.0 

furin-pre 613 HCPPG 617 

***** 

VWF-pre 1176 HCPPG 1180 



(b)

>Score = 25.0 

furin-pre 73 TKRSLSPHRP 82 
VWF-pre 761 SKRSLSCRPP 770 
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i>Score =30.0 

APP 449 ERQQLVETHMA 459 

***_.* *__ *_ 

PC7 624 DRQRLLESAMS 634

>Score = 27.0 

APP 250 DGDEVEEEAEEP 261 

* **** *_ * 

PC7 730 DPDEVETESRGP 741 

>Score = 26.0 

APP 44 HMNVQNGK 51

**** _**

PC7 773 HLDVPHGK 780

>Score = 26.0 

APP 53 DSDPSG 58 

PC7 563 DSDPNG 568

>Score = 26.0

APP 755 NGYENPTY 762 

PC7 340 DGYANSIY 347 

>Score = 26.0

APP 440 ESLEQEA 446 

PC7 68 ETLEQQA 74 

>Score = 26.0 

APP 687 KLVFFAEDVGS 697 

** ***** * 

PC7 359 RMPFYAEECAS 369 

>Score = 25.0 

APP 246 EDDEDGDEVEEEAE 259 

* * * * *__** 

PC7 62 EGDGEEETLEQQAD 75 
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